Abstract



We consider a class of multi-stage robust covering problems, where additional information is 
revealed about the problem instance in each stage, but the cost of taking actions increases. The 
dilemma for the decision-maker is whether to wait for additional information and risk the inflation, 
or to take early actions to hedge against rising costs. We study the "k-robust" uncertainty model: in
each stage $i = 0, 1, \ldots, T$, the algorithm is shown some subset of size $k_i$ that completely contains the
eventual demands to be covered; here $k_1 > k_2 > \cdots > k_T$, which ensures increasing information over
time. The goal is to minimize the cost incurred in the worst-case possible sequence of revelations.

For the multistage k-robust set cover problem, we give an $O(\log m + \log n)$-approximation
algorithm, almost matching the $\Omega(\log n + \frac{\log m}{\log\log m})$ hardness of approximation [4] even for $T = 2$
stages. Moreover, our algorithm has a useful "thrifty" property: it takes actions on just two
stages. We show similar thrifty algorithms for multi-stage k-robust Steiner tree, Steiner forest, and
minimum-cut. For these problems our approximation guarantees are $O(\min\{T, \log n, \log \lambda_{\max}\})$,
where $\lambda_{\max}$ is the maximum inflation over all the stages. We conjecture that these problems also
admit $O(1)$-approximate thrifty algorithms.

1 Introduction 

This paper considers approximation algorithms for a set of multi-stage decision problems. Here, addi- 
tional information is revealed about the problem instance in each stage, but the cost of taking actions 
increases. The decision-making algorithm has to decide whether to wait for additional information 
and risk the rising costs, or to take actions early to hedge against inflation. We consider the model of 
robust optimization, where we are told what the set of possible information revelations are, and want 
to minimize the cost incurred in the worst-case possible sequence of revelations. 

For instance, consider the following multi-stage set cover problem: initially we are given a set system
$(U, \mathcal{F})$. Our eventual goal is to cover some subset $A \subseteq U$ of this universe, but we don't know this
"scenario" $A$ up-front. All we know is that $A$ can be any subset of $U$ of size at most $k$. Moreover we
know that on each day $i$, we will be shown some set $A_i$ of size $k_i$, such that $A_i$ contains the scenario
$A$; these numbers $k_i$ decrease over time, so that we have more information as time progresses, until
$\bigcap_{i=0}^{T} A_i = A$. We can pick sets from $\mathcal{F}$ toward covering $A$ whenever we want, but the costs of sets
increase over time (in a specified fashion). Eventually, the sets we pick must cover the final subset $A$.
We want to minimize the worst-case cost

\[ \max_{\sigma = (A_1, A_2, \ldots, A_T):\ |A_t| = k_t\ \forall t} \ \text{total cost of algorithm on sequence } \sigma \tag{1.1} \]

This is a basic model for multistage robust optimization and requires minimal specification of the
uncertainty sets (it only needs the cardinality bounds $k_i$).

Robust versions of Steiner tree/forest, minimum cut, and other covering problems are similarly defined. 
This tension between waiting for information vs. the temptation to buy early and beat rising costs 
arises even in 2-stage decision problems — here we have T stages of decision-making, making this more 
acute. 

A comment on the kind of partial information we are modeling: in our setting we are given pro- 
gressively more information about events that will not happen, and are implicitly encouraged (by the 
rising prices) to plan prudently for the (up to k) events that will indeed happen. For example, consider 
a farmer who has a collection of n possible bad events ("high seed prices in the growing season", "no
rains by month i" for each month i, etc.), and who is trying to guard against up to k of these bad events happening
at the end of the planning horizon. Think of k capturing how risk-averse he is; the higher the k, the 
more events he wants to cover. He can take actions to guard against these bad events (store seed 
for planting, install irrigation systems, take out insurance, etc.). In this case, it is natural that the 
information he gets is about the bad events that do not happen. 

This should be contrasted with online algorithms, where we are only given events that do happen — 
namely, demands that need to be immediately and irrevocably covered. This difference means that 
we cannot directly use the ideas from online competitive analysis, and consequently our techniques 
are fairly different. 1 A second difference from online competitive analysis is, of course, in the ob- 
jective function: we guarantee that the cost incurred on the worst-case sequence of revelations is 
approximately minimized, as opposed to being competitive to the best series of actions for every set 
of revelations — indeed, the rising prices make it impossible to obtain a guarantee of the latter form in 
our settings. 

Our Results. In this paper, we give the first approximation algorithms for standard covering prob- 
lems (set cover, Steiner tree and forest, and min-cut) in the model of multi-stage robust optimization 
with recourse. A feature of our algorithms that makes them particularly desirable is that they are
"thrifty": they actually take actions in just two stages, regardless of the number of stages T. Hence,
even if T is polynomially large, our algorithms remain efficient and simple (note that the optimal
decision tree has potentially exponential size even for constant T). For example, the set cover algorithm
covers some set of "dangerous" elements right off the bat (on day 0), then it waits until a critical day
$t^*$ when it covers all the elements that can conceivably still lie in the final set $A$. We show that this
set-cover algorithm is an $O(\log m + \log n)$-approximation, which almost matches the hardness result
of $\Omega(\log n + \frac{\log m}{\log\log m})$ [4] for $T = 2$.

We also give thrifty algorithms for three other covering problems: Steiner tree, Steiner forest, Min- 
cut — again, these algorithms are easy to describe and to implement, and have the same structure: 

We find a solution in which decisions need to be made only at two points in time: we cover
a set of dangerous elements in stage 0 (before any additional information is received), and
then we cover all surviving elements at stage $t^*$, where $t^* = \arg\min_t \lambda_t k_t$.

For these problems, the approximation guarantee we can currently prove is no longer a constant, but
depends on the number of stages: specifically, the dependence is $O(\min\{T, \log n, \log \lambda_{\max}\})$, where
$\lambda_{\max}$ is the maximum inflation factor. While we conjecture this can be improved to a constant,
we would like to emphasize that even for T being a constant more than two, previous results and
techniques do not imply the existence of a constant-factor approximation algorithm, let alone the
existence of a thrifty algorithm.

1 It would be interesting to consider a model where a combination of positive and negative information is given, i.e.,
a mix of robust and online algorithms. We leave such extensions as future work.

The definition of "dangerous" in the above algorithm is, of course, problem dependent: e.g., for set
cover these are elements which cost more than $\mathrm{Opt}/k_{t^*}$ to cover. In general, this definition is such
that bounding the cost of the elements we cover on day $t^*$ is immediate. And what about the cost we
incur on day 0? This forms the technical heart of the proofs, which proceed by a careful backwards
induction over the stages, bounding the cost incurred in covering the dangerous elements that are
still uncovered by Opt after $j$ stages. These proofs exploit some net-type properties of the respective
covering problems, and extend the results in Gupta et al. [6]. While our algorithms appear similar
to those in [6], the proofs require new technical ideas such as the use of non-uniform thresholds in
defining "nets" and proving properties about them.

The fact that these multistage problems have near-optimal strategies with this simple structure is quite 
surprising. One can show that the optimal solution may require decision-making at all stages (we show 
an example for set cover in Section B.1). It would be interesting to understand this phenomenon
further. For problems other than set cover (i.e., those with a performance guarantee depending on T), 
can we improve the guarantees further, and/or show a tradeoff between the approximation guarantee 
and the number of stages we act in? These remain interesting directions for future research. 

We also observe in Section B.2 that thrifty algorithms perform poorly for multistage robust set-cover
even on slight generalizations of the above "k-robust uncertainty sets". In this setting it turns out
that any reasonable near-optimal solution must act on all stages. This suggests that the k-robust
uncertainty sets studied in this paper are crucial to obtaining good thrifty algorithms.

Related Work. Demand-robust optimization has long been studied in the operations research literature;
see, e.g., the survey article by Bertsimas et al. [2] and references therein. The multistage robust
model was studied in Ben-Tal et al. [1]. Most of these works involve only continuous decision variables. 
On the other hand, the problems considered in this paper involve making discrete decisions. 

Approximation algorithms for robust optimization are of more recent vintage: all these algorithms
are for two-stage optimization with discrete decision variables. Dhamdhere et al. [3] studied two-stage
versions when the scenarios were explicitly listed, and gave constant-factor approximations for Steiner
tree and facility location, and logarithmic approximations to mincut/multicut problems. Golovin
et al. [5] gave $O(1)$-approximations to robust mincut and shortest-paths. Feige et al. [4] considered
implicitly specified scenarios and introduced the k-robust uncertainty model ("scenarios are all subsets
of size k"); they gave an $O(\log m \log n)$-approximation algorithm for 2-stage k-robust set cover using
an LP-based approach. Khandekar et al. [8] gave $O(1)$-approximations for 2-stage k-robust Steiner
tree, Steiner forest on trees and facility location, using a combinatorial algorithm. Gupta et al. [6]
gave a general framework for two-stage k-robust problems, and used it to get better results for set
cover, Steiner tree and forest, mincut and multicut. We build substantially on the ideas from [6].

Approximation algorithms for multistage stochastic optimization have been given in [9, 7]; in the 
stochastic world, we are given a probability distribution over sequences, and consider the average cost 
instead of the worst-case cost in (1.1). However these algorithms currently only work for a constant 
number of stages, mainly due to the explosion in the number of potential scenarios. The current paper 
raises the possibility that for "simple" probability distributions, the techniques developed here may 
extend to stochastic optimization. 

Notation. We use $[T]$ to denote $\{0, \ldots, T\}$, and $\binom{X}{k}$ to denote the collection of all k-subsets of the
set $X$.

2 Multistage Robust Set Cover



In this section, we give an algorithm for multistage robust set cover with approximation ratio $O(\log m + \log n)$;
this approximation matches the previous best approximation guarantee for two-stage robust
set cover [6]. Moreover, our algorithm has the advantage of picking sets only in two stages. (In
Section B.1, we show that an optimal strategy might need to pick sets in all stages.)

The multistage robust set cover problem is specified by a set-system $(U, \mathcal{F})$ with $|U| = n$, set costs
$c : \mathcal{F} \to \mathbb{R}_+$, a time horizon $T$, integer values $n = k_0 > k_1 > k_2 > \cdots > k_T$, and inflation
parameters $1 = \lambda_0 < \lambda_1 < \lambda_2 < \cdots < \lambda_T$. Define $A_0 = U$, and $k_0 = |U|$. A scenario-sequence
$\mathbb{A} = (A_0, A_1, A_2, \ldots, A_T)$ is a sequence of $T + 1$ 'scenarios' such that $|A_i| = k_i$ for each $i \in [T]$. Here
$A_i$ is the information revealed to the algorithm on day $i$. The elements in $\bigcap_{i \le j} A_i$ are referred to as
being active on day $j$.

• On day 0, all elements are deemed active and any set $S \in \mathcal{F}$ may be chosen at the cost $c(S)$.

• On each day $j \ge 1$, the set $A_j$ with $k_j$ elements is revealed to the algorithm, and the active
elements are $\bigcap_{i \le j} A_i$. The algorithm can now pick any sets, where the cost of picking set $S \in \mathcal{F}$
is $\lambda_j \cdot c(S)$.

Feasibility requires that all the sets picked over all days $j \in [T]$ cover $\bigcap_{i \le T} A_i$, the elements that are
still active at the end. The goal is to minimize the worst-case cost incurred by the algorithm, the
worst-case taken over all possible scenario sequences. Let Opt be this worst-case cost for the best
possible algorithm; we will formalize this soon. The main theorem of this section is the following:
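The bookkeeping in this model (active elements, inflated costs, feasibility) can be sketched in a few lines of Python. This is only an illustration under assumed data structures: the names `active_elements`, `effective_cost` and `is_feasible`, and the representation of sets as Python `set` objects keyed by name, are hypothetical and not part of the paper.

```python
from functools import reduce

def active_elements(revealed):
    """Active elements after the revealed sets A_0, ..., A_j: their intersection."""
    return reduce(lambda acc, a: acc & a, map(set, revealed))

def effective_cost(picks_by_day, cost, lam):
    """Total cost when a set picked on day j is charged lam[j] * cost[set]."""
    return sum(lam[j] * cost[name]
               for j, names in enumerate(picks_by_day)
               for name in names)

def is_feasible(sets, picks_by_day, revealed):
    """All picked sets together must cover the elements still active at the end."""
    covered = set().union(*(sets[name] for names in picks_by_day for name in names))
    return active_elements(revealed) <= covered
```

For example, with revealed sets {1,2,3,4}, {1,2,3}, {2,3}, only the elements 2 and 3 remain active and must be covered.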

Theorem 2.1 There is an $O(\log m + \log n)$-approximation algorithm for the T-stage k-robust set
cover problem.

The algorithm is easy to state: For any element $e \in U$, let MinSet(e) denote the minimum cost set
in $\mathcal{F}$ that contains $e$. Define $\tau := \beta \cdot \max_{j \in [T]} \frac{\mathrm{Opt}}{\lambda_j k_j}$, where $\beta := 36 \ln m$ is some parameter. Let
$j^* = \arg\min_{j \in [T]} (\lambda_j k_j)$. Define the "net" $N := \{e \in U \mid c(\mathrm{MinSet}(e)) \ge \tau\}$. Our algorithm's strategy
is the following:

On day zero, choose sets $\phi_0 := \text{Greedy-Set-Cover}(N)$.

On day $j^*$, for any yet-uncovered elements $e$ in $A_{j^*}$,
pick a min-cost set in $\mathcal{F}$ covering $e$.

On all other days, do nothing.
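The two-stage strategy above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the dictionary-based representation of the set system, and the parameter `opt_guess` (a guessed value standing in for Opt) are assumptions, and every element is assumed to lie in at least one set.

```python
import math

def greedy_set_cover(targets, sets, cost):
    """Standard greedy: repeatedly pick the set with least cost per newly covered target."""
    uncovered, chosen = set(targets), []
    while uncovered:
        best = min(
            (name for name in sets if sets[name] & uncovered),
            key=lambda name: cost[name] / len(sets[name] & uncovered),
        )
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

def thrifty_plan(universe, sets, cost, lam, k, opt_guess):
    """Compute j*, the threshold tau, the net N, and the day-zero purchase."""
    beta = 36 * math.log(len(sets))                      # beta = 36 ln m
    j_star = min(range(len(lam)), key=lambda j: lam[j] * k[j])
    tau = beta * opt_guess / (lam[j_star] * k[j_star])   # max over j is attained at j*
    min_set = {e: min((s for s in sets if e in sets[s]), key=lambda s: cost[s])
               for e in universe}
    net = {e for e in universe if cost[min_set[e]] >= tau}
    day0 = greedy_set_cover(net, sets, cost)
    return j_star, tau, net, day0, min_set

def day_j_star_action(revealed, day0, sets, min_set):
    """On day j*, buy a cheapest set for every revealed element not yet covered."""
    covered = set().union(*(sets[s] for s in day0))
    return {min_set[e] for e in revealed if e not in covered}
```

Only two calls ever spend money: `thrifty_plan` on day zero and `day_j_star_action` on day j*; this is the "thrifty" structure described above.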

It is clear that this is a feasible strategy; indeed, all elements that are still active on day $j^*$ are covered
on that day. (In fact, it would have sufficed to just cover all the elements in $\bigcap_{i \le j^*} A_i$.) Note that
this strategy pays nothing on days other than 0 and $j^*$; we now bound the cost incurred on these two
days.

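Claim 2.2 For any scenario-sequence $\mathbb{A}$, the cost on day $j^*$ is at most $\beta \cdot \mathrm{Opt}$.

Proof: The sets chosen in day $j^*$ on sequence $\mathbb{A}$ are $\{\mathrm{MinSet}(e) \mid e \in A_{j^*} \setminus N\}$, which costs us

\[ \lambda_{j^*} \sum_{e \in A_{j^*} \setminus N} c(\mathrm{MinSet}(e)) \le \lambda_{j^*} |A_{j^*}| \cdot \tau = \lambda_{j^*} k_{j^*} \tau = \beta \cdot \mathrm{Opt}. \]

The first inequality is by the choice of $N$, the last equality is by $\tau$'s definition. ■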
Lemma 2.3 The cost of covering the net $N$ on day zero is at most $O(\log n) \cdot \mathrm{Opt}$.

The proof of Lemma 2.3 will occupy the rest of this section; before we do that, note that Claim 2.2
and Lemma 2.3 complete the proof for Theorem 2.1. Note that while the definition of the set $N$
requires us to know Opt, we can just run over polynomially many guesses for Opt and choose the one
that minimizes the cost for day zero plus $\tau \cdot k_{j^*} \lambda_{j^*}$ (see [6] for a rigorous argument).
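One way to picture the guessing step is the sketch below. It is an assumption-laden illustration: the geometric grid with ratio `1 + eps`, the bounds `lo` and `hi`, and the callback `day_zero_cost` (which would return the cost of the greedy cover of the net defined by a given guess) are all hypothetical; the rigorous argument is the one in [6]. Since $\tau \cdot k_{j^*} \lambda_{j^*} = \beta \cdot \mathrm{Opt}$ by the definition of $\tau$, the quantity minimized for a guess `g` is `day_zero_cost(g) + beta * g`.

```python
def geometric_guesses(lo, hi, eps=0.5):
    """Yield lo, lo*(1+eps), lo*(1+eps)^2, ... until the range [lo, hi] is covered."""
    g = lo
    while g < hi * (1 + eps):
        yield g
        g *= 1 + eps

def best_guess(day_zero_cost, beta, lo, hi, eps=0.5):
    """Pick the guess minimizing (day-zero cost) + tau * k_{j*} * lambda_{j*} = cost + beta * guess."""
    return min(geometric_guesses(lo, hi, eps),
               key=lambda g: day_zero_cost(g) + beta * g)
```

The number of guesses is logarithmic in `hi / lo`, which is polynomial in the input size whenever the costs are encoded in binary.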

The proof will show that the fractional cost of covering the elements in the net N is at most Opt, 
and then invoke the integrality gap for the set covering LP. For the fractional cost, the proof is via a 
careful backwards induction on the number of stages, showing that if we mimic the optimal strategy
for the first $j - 1$ steps, then the fractional cost of covering the remaining active net elements at stage
$j$ is related to a portion of the optimal value as well. This is easy to prove for the stage $T$, and the
claim for stage 0 exactly bounds the cost of fractionally covering the net. To write down the precise
induction, we next give some notation and formally define what a strategy is (which will be used in 
the subsequent sections for the other problems as well), and then proceed with the proof. 

Formalizing What a Strategy Means. For any collection $\mathcal{G} \subseteq \mathcal{F}$ of sets, let $\mathrm{Cov}(\mathcal{G}) \subseteq U$ denote
the elements covered by the sets in $\mathcal{G}$, and let $c(\mathcal{G})$ denote the sum of costs of sets in $\mathcal{G}$. At any day $i$,
the state of the system is given by the subsequence $(A_0, A_1, \ldots, A_i)$ seen thus far. Given any scenario
sequence $\mathbb{A}$ and $i \in [T]$, we define $\mathbb{A}_i = (A_0, A_1, \ldots, A_i)$ to be the partial scenario sequence for days
0 through $i$.

A solution is a strategy $\Phi$, given by a sequence of maps $(\phi_0, \phi_1, \ldots, \phi_T)$, where each one of these
maps $\phi_i$ maps the state $\mathbb{A}_i$ on day $i$ to a collection of sets that are picked on that day. For any
scenario-sequence $\mathbb{A} = (A_1, A_2, \ldots, A_T)$, the strategy $\Phi$ does the following:

• On day 0, when all elements in $U$ are active, the sets in $\phi_0$ are chosen, and $\mathcal{G}_1 \leftarrow \phi_0$.

• At the beginning of day $i \in \{1, \ldots, T\}$, sets in $\mathcal{G}_i$ have already been chosen; moreover, the
elements in $\bigcap_{j \le i} A_j$ are the active ones. Now, sets in $\phi_i(\mathbb{A}_i)$ are chosen, and hence we set
$\mathcal{G}_{i+1} \leftarrow \mathcal{G}_i \cup \phi_i(\mathbb{A}_i)$.

The solution $\Phi = (\phi_i)_i$ is feasible if for every scenario-sequence $\mathbb{A} = (A_1, A_2, \ldots, A_T)$, the collection
$\mathcal{G}_{T+1}$ of sets chosen at the end of day $T$ covers $\bigcap_{i \le T} A_i$, i.e. $\mathrm{Cov}(\mathcal{G}_{T+1}) \supseteq \bigcap_{i \le T} A_i$. The cost of this
strategy $\Phi$ on a fixed sequence $\mathbb{A}$ is the total effective cost of sets picked:

\[ C(\Phi \mid \mathbb{A}) = c(\phi_0) + \sum_{i=1}^{T} \lambda_i \cdot c(\phi_i(\mathbb{A}_i)). \]

The objective in the robust multistage problem is to minimize $\mathrm{RobCov}(\Phi)$, the effective cost under
the worst case scenario-sequence, namely:

\[ \mathrm{RobCov}(\Phi) := \max_{\mathbb{A}} C(\Phi \mid \mathbb{A}) \]

The goal is to find a strategy with least cost; for the rest of the section, fix $\Phi^* = \{\phi_i^*\}$ to be such a
strategy, and let $\mathrm{Opt} = \mathrm{RobCov}(\Phi^*)$ denote the optimal objective value.
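For very small instances, the definition of RobCov can be checked directly by enumerating every scenario-sequence. The sketch below does this by brute force; it is exponential and meant only to make the definition concrete. The function name `rob_cov` and the convention that a strategy is a callable `strategy(i, seen)` returning the names of the sets bought on day `i`, given the tuple `seen` of sets revealed on days 1 through `i`, are assumptions made for illustration.

```python
from itertools import combinations, product

def rob_cov(strategy, universe, sets, cost, lam, k):
    """Worst-case effective cost of `strategy`; infinity if some sequence is left uncovered."""
    T = len(lam) - 1
    # Every day i >= 1 may reveal any subset of the universe of size k[i].
    choices = [[frozenset(c) for c in combinations(sorted(universe), k[i])]
               for i in range(1, T + 1)]
    worst = 0.0
    for seq in product(*choices):
        picked, total = set(), 0.0
        for i in range(T + 1):
            new = strategy(i, seq[:i])            # phi_i sees A_1, ..., A_i
            total += lam[i] * sum(cost[s] for s in new)
            picked |= set(new)
        active = set(universe).intersection(*seq)
        covered = set().union(*(sets[s] for s in picked))
        if not active <= covered:
            return float("inf")                   # infeasible strategy
        worst = max(worst, total)
    return worst
```

On a three-element instance with singleton sets of unit cost, one revealed element and inflation 2 on day 1, waiting costs 2 in the worst case while buying everything on day 0 costs 3.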

Completing Proof of Lemma 2.3. First, we assume that the inflation factors satisfy $\lambda_{j+1} \ge 12 \cdot \lambda_j$
for all $j \ge 0$. If the instance does not have this property, we can achieve this by merging consecutive
days having comparable inflations, and lose a factor of 12 in the approximation ratio. The choice of
constant 12 comes from a lemma from [6].
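The grouping of days with comparable inflation can be pictured as below. This is only a sketch of one natural grouping rule, stated as an assumption: a new group starts whenever the inflation reaches 12 times the inflation of the current group's first day. How the merged day's cardinality bound and inflation are then set is not spelled out in this section, so the code deliberately returns only the groups of day indices; the name `group_days` is hypothetical.

```python
def group_days(lam, factor=12.0):
    """Partition days 0..T into consecutive groups whose first-day inflations
    grow by at least `factor` from one group to the next."""
    groups = [[0]]
    for j in range(1, len(lam)):
        if lam[j] >= factor * lam[groups[-1][0]]:
            groups.append([j])        # inflation jumped: start a new merged day
        else:
            groups[-1].append(j)      # comparable inflation: merge into current group
    return groups
```

Within a group, all inflations are within a factor 12 of the group's first day, which is where a loss of a factor of 12 would come from.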

Lemma 2.4 ([6]) Consider any instance of set cover; let $B \in \mathbb{R}_+$ and $k \in \mathbb{Z}_+$ be values such that

• the set of minimum cost covering any element costs $\ge 36 \ln m \cdot \frac{B}{k}$, and

• the minimum cost of fractionally covering any k-subset of elements is $\le B$.

Then the minimum cost of fractionally covering all elements is at most $r \cdot B$, for a value $r \le 12$.

For a partial scenario sequence $\mathbb{A}_i$ on days up to $i$, we use $\phi_i^*(\mathbb{A}_i)$ to denote the sets chosen on day
$i$ by the optimal strategy, and $\phi^*_{\le i}(\mathbb{A}_i)$ to denote the sets chosen on days $\{0, 1, \ldots, i\}$, again by the
optimal strategy.

Definition 2.5 For any $j \in [T]$ and $\mathbb{A}_j = (A_1, \ldots, A_j)$, define

\[ V_j(\mathbb{A}_j) := \max_{(A_{j+1}, \ldots, A_T):\ |A_t| = k_t\ \forall t} \ \sum_{i=j}^{T} r^{i-j} \cdot c(\phi_i^*(\mathbb{A}_i)). \]

That is, $V_j(\mathbb{A}_j)$ is the worst-case cost incurred by $\Phi^*$ on days $\{j, \ldots, T\}$ conditioned on $\mathbb{A}_j$, under
modified inflation factors $r^{i-j}$ for each day $i \in \{j, \ldots, T\}$. We use this definition with $r$ being the
constant from Lemma 2.4. Recall that we assumed that $\lambda_i \ge r^i$.

Fact 2.6 The function $V_0(\cdot)$ takes the empty sequence as its argument, and returns
$V_0 = \max_{\mathbb{A}} \sum_{i=0}^{T} r^i \cdot c(\phi_i^*(\mathbb{A}_i)) \le \max_{\mathbb{A}} \sum_{i=0}^{T} \lambda_i \cdot c(\phi_i^*(\mathbb{A}_i)) = \mathrm{Opt}$.

For any subset $U' \subseteq U$ and any collection of sets $\mathcal{G} \subseteq \mathcal{F}$, define $\mathrm{LP}(U' \mid \mathcal{G})$ as the minimum cost
of fractionally covering $U'$, given all the sets in $\mathcal{G}$ at zero cost. Given any sequence $\mathbb{A}$, it will also
be useful to define $\widehat{A}_j = \bigcap_{i \le j} A_i$ as the active elements on day $j$. Our main technical lemma is the
following:

Lemma 2.7 For any $j \in [T]$ and partial scenario sequence $\mathbb{A}_j$, we have:

\[ \mathrm{LP}\left(N \cap \widehat{A}_j \mid \phi^*_{\le j-1}(\mathbb{A}_{j-1})\right) \le V_j(\mathbb{A}_j). \]

In other words, the fractional cost of covering $N \cap \widehat{A}_j$ (the "net" still active in stage $j$) given sets
$\phi^*_{\le j-1}(\mathbb{A}_{j-1})$ for free is at most $V_j(\mathbb{A}_j)$.

Before we prove this, note that for $j = 0$, the lemma implies that $\mathrm{LP}(N) \le V_0 \le \mathrm{Opt}$. Since the
integrality gap for the set cover LP is at most $H_n$ (as witnessed by the greedy algorithm), this implies
that the cost on day 0 is at most $O(\log n) \cdot \mathrm{Opt}$, which proves Lemma 2.3.

Proof: We induct on $j \in \{0, \ldots, T\}$ with $j = T$ as base case. In this case, we have a complete
scenario-sequence $\mathbb{A}_T = \mathbb{A}$, and the feasibility of the optimal strategy implies that $\phi^*_{\le T}(\mathbb{A}_T)$ completely
covers $\widehat{A}_T$. So,

\[ \mathrm{LP}\left(\widehat{A}_T \mid \phi^*_{\le T-1}(\mathbb{A}_{T-1})\right) \le c(\phi_T^*(\mathbb{A}_T)) = V_T(\mathbb{A}_T). \]

For the induction step, suppose now $j < T$, and assume the lemma for $j + 1$. Here's the roadmap
for the proof: we want to bound the fractional cost to cover elements in $N \cap \widehat{A}_j \setminus \mathrm{Cov}(\phi^*_{\le j-1}(\mathbb{A}_{j-1}))$
since the sets $\phi^*_{\le j-1}(\mathbb{A}_{j-1})$ are free. Some of these elements are covered by $\phi_j^*(\mathbb{A}_j)$, and we want to
calculate the cost of the others; for these we'll use the inductive hypothesis. So given the scenarios
$A_1, \ldots, A_j$ until day $j$, define

\[ W_j(\mathbb{A}_j) := \max_{B_{j+1}:\ |B_{j+1}| = k_{j+1}} V_{j+1}(A_1, \ldots, A_j, B_{j+1}) \implies V_j(\mathbb{A}_j) = c(\phi_j^*(\mathbb{A}_j)) + r \cdot W_j(\mathbb{A}_j). \tag{2.2} \]
Let us now prove two simple subclaims. 
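Claim 2.8 $W_j(\mathbb{A}_j) \le \mathrm{Opt}/\lambda_{j+1}$.

Proof: Suppose that $W_j(\mathbb{A}_j)$ is defined by the sequence $(A_{j+1}, \ldots, A_T)$; i.e. $W_j(\mathbb{A}_j) = \sum_{i=j+1}^{T} r^{i-j-1} \cdot c(\phi_i^*(\mathbb{A}_i))$.
Then, considering the scenario-sequence $\mathbb{A} = (A_1, \ldots, A_j, A_{j+1}, \ldots, A_T)$, we have:

\[ \mathrm{Opt} \ge \sum_{i=0}^{T} \lambda_i \cdot c(\phi_i^*(\mathbb{A}_i)) \ge \sum_{i=j+1}^{T} \lambda_i \cdot c(\phi_i^*(\mathbb{A}_i)) \ge \sum_{i=j+1}^{T} \lambda_{j+1} r^{i-j-1} \cdot c(\phi_i^*(\mathbb{A}_i)) = \lambda_{j+1} \cdot W_j(\mathbb{A}_j). \]

The third inequality uses the assumption that $\lambda_{\ell+1} \ge r \cdot \lambda_\ell$ for all days $\ell$. ◁

Claim 2.9 For any $A_{j+1}$ with $|A_{j+1}| = k_{j+1}$, we have

\[ \mathrm{LP}\left(N \cap \widehat{A}_{j+1} \mid \phi^*_{\le j}(\mathbb{A}_j)\right) \le W_j(\mathbb{A}_j). \]

Proof: By the induction hypothesis for $j + 1$, and $V_{j+1}(\mathbb{A}_{j+1}) \le W_j(\mathbb{A}_j)$. ◁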
Now we are ready to apply Lemma 2.4 to complete the proof of the inductive step. 

Claim 2.10 Consider the set-system $\mathcal{G}$ with elements $N' := N \cap \left(\widehat{A}_j \setminus \mathrm{Cov}(\phi^*_{\le j}(\mathbb{A}_j))\right)$ and the sets
$\mathcal{F} \setminus \phi^*_{\le j}(\mathbb{A}_j)$. The fractional cost of covering $N'$ is at most $r \cdot W_j(\mathbb{A}_j)$.

Proof: In order to use Lemma 2.4 on this set system, let us verify the two conditions:

1. Since $N' \subseteq N$, the cost of the cheapest set covering any $e \in N'$ is at least
$\tau \ge \beta \cdot \frac{\mathrm{Opt}}{\lambda_{j+1} k_{j+1}} \ge \beta \cdot \frac{W_j(\mathbb{A}_j)}{k_{j+1}}$, using the definition of the threshold $\tau$, and Claim 2.8; recall $\beta = 36 \ln m$.

2. For every $X \subseteq N'$ with $|X| \le k_{j+1}$, the minimum cost to fractionally cover $X$ in $\mathcal{G}$ is at most
$W_j(\mathbb{A}_j)$. To see this, augment $X$ arbitrarily to form $A_{j+1}$ of size $k_{j+1}$; now Claim 2.9 applied to
$A_{j+1}$ implies that the fractional covering cost for $N \cap \widehat{A}_{j+1} = N \cap (\widehat{A}_j \cap A_{j+1})$ in $\mathcal{G}$ is at most
$W_j(\mathbb{A}_j)$; since $X \subseteq N' \subseteq N \cap \widehat{A}_j$ and $X \subseteq A_{j+1}$, the covering cost for $X$ in $\mathcal{G}$ is also at most
$W_j(\mathbb{A}_j)$.

We now apply Lemma 2.4 on set-system $\mathcal{G}$ with parameters $B := W_j(\mathbb{A}_j)$ and $k = k_{j+1}$ to infer that
the minimum cost to fractionally cover $N'$ using sets from $\mathcal{F} \setminus \phi^*_{\le j}(\mathbb{A}_j)$ is at most $r \cdot W_j(\mathbb{A}_j)$. ◁

To fractionally cover $N \cap \widehat{A}_j$, we can use the fractional solution promised by Claim 2.10, and integrally
add the sets $\phi_j^*(\mathbb{A}_j)$. This implies that

\[ \mathrm{LP}\left(N \cap \widehat{A}_j \mid \phi^*_{\le j-1}(\mathbb{A}_{j-1})\right) \le c(\phi_j^*(\mathbb{A}_j)) + r \cdot W_j(\mathbb{A}_j) = V_j(\mathbb{A}_j), \]

where the last equality follows from (2.2). This completes the induction and proves Lemma 2.7. ■

3 Multistage Robust Minimum Cut 

We now turn to the multistage robust min-cut problem, and show: 

Theorem 3.1 There is an $O(\min\{T, \log n, \log \lambda_{\max}\})$-approximation algorithm for T-stage k-robust
minimum cut.

In this section we prove an $O(T)$-approximation ratio where $T$ is the number of stages; in Appendix A
we show that simple scaling arguments can be used to ensure $T$ is at most $\min\{\log n, \log \lambda_{\max}\}$,
yielding Theorem 3.1. Unlike set cover, the guarantee here depends on the number of stages. Here is
the high-level reason for this additional loss: in each stage of an optimal strategy for set cover, any
element was either completely covered or left completely uncovered; there was no partial coverage.
However in min-cut, the optimal strategy could keep whittling away at the cut for a node in each
stage. The main idea to deal with this is to use a stage-dependent definition of "net" in the inductive
proof (see Lemma 3.4 for more detail), which in turn results in an $O(T)$ loss.

The input consists of an undirected graph $G = (U, E)$ with edge-costs $c : E \to \mathbb{R}_+$ and root $p$. For
any subset $U' \subseteq U$ and subgraph $H$ of $G$, we denote by $\mathrm{MinCut}_H(U')$ the minimum cost of a cut
separating $U'$ from $p$ in $H$. If no graph is specified then it is relative to the original graph $G$. Recall
that a scenario sequence is $\mathbb{A} = (A_0, A_1, \ldots, A_T)$ where each $A_i \subseteq U$ and $|A_i| = k_i$, and we denote the
partial scenario sequence $(A_0, A_1, \ldots, A_j)$ by $\mathbb{A}_j$.

We will use notation developed in Section 2. Let the optimal strategy be $\Phi^* = \{\phi_j^*\}_{j=0}^{T}$, where
now $\phi_j^*(\mathbb{A}_j)$ maps to a set of edges in $G$ to be cut in stage $j$. The feasibility constraint is that
$\phi^*_{\le T}(\mathbb{A}_T)$ separates the vertices in $\bigcap_{i \le T} A_i$ from the root $p$. Let the cost of the optimal solution be
$\mathrm{Opt} = \mathrm{RobCov}(\Phi^*)$.

Again, the algorithm depends on showing a near-optimal two-stage strategy: define $\tau := \beta \cdot \max_{j \in [T]} \frac{\mathrm{Opt}}{\lambda_j k_j}$
where $\beta = 50$. Let $j^* = \arg\min_{j \in [T]} (\lambda_j k_j)$. Let the "net" $N := \{v \in U \mid \mathrm{MinCut}(v) \ge 2T \cdot \tau\}$. The
algorithm is:

On day 0, delete $\phi_0 := \mathrm{MinCut}(N)$ to separate the "net" $N$ from $p$.

On day $j^*$, for each vertex $u$ in $A_{j^*} \setminus N$, delete a minimum u-p cut in $G$.

On all other days, do nothing.
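The day-zero part of this strategy can be sketched as follows. The code computes only cut values (not the cut edges) with a plain Edmonds-Karp max-flow; this choice of max-flow routine, the function names, the edge-list representation `(u, v, cost)`, and the parameter `opt_guess` standing in for Opt are all assumptions made for illustration.

```python
from collections import defaultdict, deque

def min_cut_value(edges, sources, root):
    """Min cost of a cut separating all of `sources` from `root` (max-flow = min-cut)."""
    cap = defaultdict(lambda: defaultdict(float))
    for u, v, c in edges:                 # undirected edge: capacity c in both directions
        cap[u][v] += c
        cap[v][u] += c
    super_source = ("super-source",)
    for s in sources:                     # tie all sources together with infinite capacity
        cap[super_source][s] = float("inf")
        cap[s][super_source] += 0.0
    flow = 0.0
    while True:
        parent = {super_source: None}     # BFS for a shortest augmenting path
        queue = deque([super_source])
        while queue and root not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if root not in parent:
            return flow
        bottleneck, v = float("inf"), root
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = root
        while parent[v] is not None:      # push flow along the path
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

def thrifty_min_cut_plan(vertices, edges, root, lam, k, opt_guess, beta=50.0):
    """Return j*, tau, the net N, and the cost of the day-zero cut separating N from root."""
    T = len(lam) - 1
    j_star = min(range(T + 1), key=lambda j: lam[j] * k[j])
    tau = beta * opt_guess / (lam[j_star] * k[j_star])
    net = {v for v in vertices
           if v != root and min_cut_value(edges, [v], root) >= 2 * T * tau}
    return j_star, tau, net, min_cut_value(edges, net, root)
```

On day j*, the same `min_cut_value` routine (extended to return the cut edges) would be called once per revealed vertex outside the net.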

Again, it is clear that this strategy is feasible: all vertices in $\bigcap_{i \le T} A_i$ are either separated from the root
on day 0, or on day $j^*$. Moreover, the effective cost of the cut on day $j^*$ is at most $\lambda_{j^*} \cdot 2T\tau \cdot |A_{j^*}| =
2\beta T \cdot \mathrm{Opt} = O(T) \cdot \mathrm{Opt}$. Hence it suffices to show the following:

Lemma 3.2 The min-cut separating $N$ from the root $p$ costs at most $O(T) \cdot \mathrm{Opt}$.

Again, the proof is via a careful induction on the stages. Loosely speaking, our induction is based on
the following: amongst scenario sequences $\mathbb{A}$ containing any fixed "net" vertex $v \in N$ (i.e. $v \in \bigcap_{i \le T} A_i$)
the optimal strategy must reduce the min-cut of $v$ (in an average sense) by a factor $1/T$ in some stage.

The proof of Lemma 3.2 again depends on a structural lemma proved in [6]: 

Lemma 3.3 ([6]) Consider any instance of minimum cut in an undirected graph with root $p$ and
terminals $X$; let $B \in \mathbb{R}_+$ and $k \in \mathbb{Z}_+$ be values such that

• the minimum cost cut separating $p$ and $x$ costs $\ge 10 \cdot \frac{B}{k}$, for every $x \in X$.

• the minimum cost cut separating $p$ and $L$ is $\le B$, for every $L \in \binom{X}{k}$.

Then the minimum cost cut separating $p$ and all terminals $X$ is at most $r \cdot B$, for a value $r \le 10$.
In this section, we assume $\lambda_{j+1} \ge 10 \cdot \lambda_j$ for all $j \in [T]$. Recall the quantity $V_j(\mathbb{A}_j)$ from Definition 2.5:

\[ V_j(\mathbb{A}_j) := \max_{(A_{j+1}, \ldots, A_T):\ |A_t| = k_t\ \forall t} \ \sum_{i=j}^{T} r^{i-j} \cdot c(\phi_i^*(\mathbb{A}_i)), \]

where $r := 10$ from Lemma 3.3. Since $\lambda_i \ge r^i$, it follows that $V_0 \le \mathrm{Opt}$. The next lemma is now
the backwards induction proof that relates the cost of cutting subsets of the net $N$ to the $V_j$s. This
finally bounds the cost of separating the entire net $N$ from $p$ in terms of $V_0 \le \mathrm{Opt}$. Given any $\mathbb{A}_j$,
recall that $\widehat{A}_j = \bigcap_{i \le j} A_i$.

Lemma 3.4 For any $j \in [T]$ and partial scenario sequence $\mathbb{A}_j$,

• if $H := G \setminus \phi^*_{\le j-1}(\mathbb{A}_{j-1})$ (the residual graph in OPT's run at the beginning of stage $j$), and

• $N_j := \{v \in \widehat{A}_j \mid \mathrm{MinCut}_H(v) \ge (2T - j) \cdot \tau\}$ (the "net" elements)

then $\mathrm{MinCut}_H(N_j) \le 5T \cdot V_j(\mathbb{A}_j)$.

Before we prove the lemma, note that when we set $j = 0$ the lemma claims that in $G$, the min-cut
separating $N_0 = N$ from $p$ costs at most $5T \cdot V_0 \le O(T) \cdot \mathrm{Opt}$, which proves Lemma 3.2. Hence it
suffices to prove Lemma 3.4. Note the difference from the induction used for set-cover: the thresholds
used to define nets are non-uniform over the stages.

Proof: We induct on $j \in \{0, \ldots, T\}$. The base case is $j = T$, where we have a complete scenario-sequence
$\mathbb{A}_T$: by feasibility of the optimum, $\phi^*_{\le T}$ cuts $\widehat{A}_T \supseteq N_T$ from $p$ in $G$. Thus the min-cut for
$N_T$ in $G \setminus \phi^*_{\le T-1}(\mathbb{A}_{T-1})$ costs at most $c(\phi_T^*(\mathbb{A}_T)) = V_T(\mathbb{A}_T) \le 5T \cdot V_T(\mathbb{A}_T)$.

Now assuming the inductive claim for $j + 1 \le T$, we prove it for $j$. Let $H = G \setminus \phi^*_{\le j-1}(\mathbb{A}_{j-1})$ be the
residual graph after $j - 1$ stages, and let $H' = H \setminus \phi_j^*(\mathbb{A}_j)$ be the residual graph after $j$ stages. Let us
divide up $N_j$ into two parts, $N_j^1 := \{v \in N_j \mid \mathrm{MinCut}_{H'}(v) \ge (2T - j - 1) \cdot \tau\}$ and $N_j^2 = N_j \setminus N_j^1$, and
bound the mincut of the two parts in $H'$ separately.

Claim 3.5 $\mathrm{MinCut}_{H'}(N_j^2) \le 4T \cdot c(\phi_j^*(\mathbb{A}_j))$.

Proof: Note that the set $N_j^2$ consists of the points that have "high" mincut in the graph $H$ after
$j - 1$ stages, but have "low" mincut in the graph $H'$ after $j$ stages. For these we use a Gomory-Hu
tree-based argument like that in [5]. Formally, let $t := (2T - j) \cdot \tau \le 2T\tau$. Hence for every $u \in N_j^2$,
we have:

\[ \mathrm{MinCut}_H(u) \ge t \quad \text{and} \quad \mathrm{MinCut}_{H'}(u) < \left(1 - \tfrac{1}{2T}\right) t. \tag{3.3} \]

Consider the Gomory-Hu (cut-equivalent) tree $\mathcal{T}(H')$ on graph $H'$, and root it at $p$. For each vertex
$u \in N_j^2$, let $(X_u, \overline{X}_u)$ denote the minimum p-u cut in $\mathcal{T}(H')$, where $u \in X_u$ and $p \notin X_u$. Pick a subset
$N' \subseteq N_j^2$ such that the union of their respective min-cuts in $\mathcal{T}(H')$ separates all of $N_j^2$ from $p$ and
their corresponding sets $X_u$ are disjoint; the set of cuts in tree $\mathcal{T}(H')$ closest to the root $p$ gives such
a collection. Define $F := \bigcup_{u \in N'} \partial_{H'}(X_u)$; this is a feasible cut in $H'$ separating $N_j^2$ from $p$.

Note that (3.3) implies that for all $u \in N_j^2$ (and hence for all $u \in N'$), we have

(i) $c(\partial_{H'}(X_u)) \le \left(1 - \tfrac{1}{2T}\right) \cdot t$ since $X_u$ is a minimum p-u cut in $H'$, and

(ii) $c(\partial_H(X_u)) \ge t$ since it is a feasible p-u cut in $H$.

Thus $c(\partial_{H \setminus H'}(X_u)) = c(\partial_H(X_u)) - c(\partial_{H'}(X_u)) \ge \tfrac{t}{2T} \ge \tfrac{1}{2T} \cdot c(\partial_{H'}(X_u))$. So

\[ c(\partial_{H'}(X_u)) \le 2T \cdot c(\partial_{H \setminus H'}(X_u)) \quad \text{for all } u \in N'. \tag{3.4} \]

Consequently,

\[ c(F) \le \sum_{u \in N'} c(\partial_{H'}(X_u)) \le 2T \cdot \sum_{u \in N'} c(\partial_{H \setminus H'}(X_u)) \le 4T \cdot c(H \setminus H') = 4T \cdot c(\phi_j^*(\mathbb{A}_j)). \]

The first inequality follows from subadditivity, the second from (3.4), the third uses disjointness of
$\{X_u\}_{u \in N'}$, and the equality follows from $H \setminus H' = \phi_j^*(\mathbb{A}_j)$. Thus $\mathrm{MinCut}_{H'}(N_j^2) \le 4T \cdot c(\phi_j^*(\mathbb{A}_j))$. ◁

Now to bound the cost of separating $N_j^1$ from $p$. Recall the quantity $W_j(\mathbb{A}_j)$ from (2.2),

\[ W_j(\mathbb{A}_j) := \max_{B_{j+1}:\ |B_{j+1}| = k_{j+1}} V_{j+1}(A_1, \ldots, A_j, B_{j+1}), \]

and that $V_j(\mathbb{A}_j) = c(\phi_j^*(\mathbb{A}_j)) + r \cdot W_j(\mathbb{A}_j)$.

Claim 3.6 $\mathrm{MinCut}_{H'}(N_j^1) \le 5rT \cdot W_j(\mathbb{A}_j)$.

Proof: The definition of $N_j^1$ implies that for each $u \in N_j^1$ we have:

\[ \mathrm{MinCut}_{H'}(u) \ge (2T - j - 1)\tau \ge T\tau \ge \beta T \cdot \frac{\mathrm{Opt}}{\lambda_{j+1} k_{j+1}} \ge \beta T \cdot \frac{W_j(\mathbb{A}_j)}{k_{j+1}}, \tag{3.5} \]

where the last inequality is by Claim 2.8. Furthermore, for any $k_{j+1}$-subset $L \subseteq N_j^1 \subseteq \widehat{A}_j$ we have:

\[ \mathrm{MinCut}_{H'}(L) \le 5T \cdot V_{j+1}(A_1, \ldots, A_j, L) \le 5T \cdot W_j(\mathbb{A}_j). \tag{3.6} \]

The first inequality is by applying the induction hypothesis to $(A_1, \ldots, A_j, L)$; induction can be
applied since $L$ is a "net" for this partial scenario sequence (recall $L \subseteq N_j^1$ and the definition of $N_j^1$).
The second inequality is by definition of $W_j(\mathbb{A}_j)$.

Now we apply Lemma 3.3 on graph $H'$ with terminals $X = N_j^1$, bound $B = 5T \cdot W_j(\mathbb{A}_j)$, and $k = k_{j+1}$.
Since $\beta = 50$, equations (3.5)-(3.6) imply that the conditions in Lemma 3.3 are satisfied, and we get
$\mathrm{MinCut}_{H'}(N_j^1) \le 5rT \cdot W_j(\mathbb{A}_j)$ to prove Claim 3.6. ◁

Finally,

\[ \mathrm{MinCut}_H(N_j) \le \mathrm{MinCut}_{H'}(N_j^1) + \mathrm{MinCut}_{H'}(N_j^2) + c(\phi_j^*(\mathbb{A}_j)) \]
\[ \le 5rT \cdot W_j(\mathbb{A}_j) + 4T \cdot c(\phi_j^*(\mathbb{A}_j)) + c(\phi_j^*(\mathbb{A}_j)) \le 5T \cdot V_j(\mathbb{A}_j). \]

The first inequality uses subadditivity of the cut function, the second uses Claims 3.5 and 3.6, and
the third uses $T \ge 1$ and definition of $W_j(\mathbb{A}_j)$. This completes the proof of the inductive step, and
hence of Lemma 3.4. ■



4 Multistage Robust Steiner Tree 

We now turn to the multistage robust Steiner tree problem. 

Theorem 4.1 There is an $O(\min\{T, \log n, \log \lambda_{\max}\})$-approximation algorithm for T-stage k-robust
Steiner tree.



Again, we prove an $O(T)$-approximation ratio where $T$ is the number of stages; Theorem 4.1 then
follows from the scaling arguments in Appendix A. This is the first of the problems we consider where 
the net creation is not "parallel": whereas in the two previous problems, we set the net to be all 
elements that were heavy in some formal sense, here we will pick as our net a set of elements that are 
mutually far from each other. 

The input consists of an undirected edge-weighted graph $G = (U, E)$. For any edge-weighted graph
$G' = (U', E')$ and $u, v \in U'$, let $d_{G'}(u, v)$ denote the shortest-path distance between $u$ and $v$ in $G'$;
if no graph is specified then the distance is relative to the original $G$. Given a graph $G' = (U', E')$
and a set of edges $E'' \subseteq E'$, the graph $G'/E''$ is defined by resetting the edges in $E''$ to have zero
length. For any graph $G'$ and subset $X$ of vertices, $\mathrm{MinSt}_{G'}(X)$ is the minimum cost of a Steiner tree
on $X$. Recall the definition of a scenario sequence $\mathbb{A} = (A_0, A_1, \ldots, A_T)$, partial scenario sequence
$\mathbb{A}_j = (A_0, A_1, \ldots, A_j)$ and notation from Section 2. The optimal strategy is $\Phi^* = \{\phi_j^*\}_{j=0}^{T}$, where
$\phi_j^*(\mathbb{A}_j)$ maps to a set of edges in $G$ to be chosen in stage $j$. The feasibility constraint is that $\phi^*_{\le T}(\mathbb{A}_T)$
connects the vertices in $\bigcap_{i \le T} A_i$ to each other. The optimal cost is $\mathrm{Opt} = \mathrm{RobCov}(\Phi^*)$.

Define $\tau := \beta \cdot \max_{j \in [T]} \frac{\mathrm{Opt}}{\lambda_j k_j}$ for $\beta = 10$, and $j^* = \arg\min_{j \in [T]} (\lambda_j k_j)$. Let $N$ be a maximal subset of
$U$ such that $d(u, v) > 4T\tau$ for all $u \ne v$, $u, v \in N$. The algorithm is:

On day 0, buy edges $\phi_0 := \mathrm{MST}_G(N)$, i.e., a 2-approximate Steiner tree on $N$.

On day $j^*$, for each $u \in A_{j^*}$, buy a shortest path from $u$ to $N$ in the residual graph $G/\phi_0$.

On all other days, do nothing.
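
As an illustration only, the net $N$ can be built greedily. The following is a minimal Python sketch under our own assumptions: the graph is given as an adjacency dictionary, the helper names `dijkstra` and `greedy_net` are hypothetical, and `threshold` plays the role of $4T\tau$. It is not claimed to be the procedure used by the authors; any maximal set with pairwise distances above the threshold would do.

```python
import heapq
from math import inf


def dijkstra(graph, source):
    """Shortest-path distances from `source`.

    graph: dict mapping vertex -> list of (neighbor, edge_length) pairs.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u]:
            if d + w < dist.get(v, inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def greedy_net(graph, threshold):
    """Greedily pick a maximal set N with d(u, v) > threshold for distinct u, v in N."""
    net = []
    covered = set()  # vertices within `threshold` of some chosen net point
    for v in sorted(graph):
        if v in covered:
            continue
        net.append(v)
        for u, d in dijkstra(graph, v).items():
            if d <= threshold:
                covered.add(u)
    return net
```

By construction every vertex left out of the returned set is within `threshold` of some net point, which is the maximality property used to bound the day-$j^*$ cost.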

Again, the cost incurred on day $j^*$ is not high: by the maximality of $N$, the distance from any $u \in A_{j^*}$
to $N$ is at most $4T\tau$, and hence the total effective cost incurred is $4T\tau \cdot |A_{j^*}| \cdot \lambda_{j^*} = 4T\beta \cdot \mathrm{Opt} =
O(T) \cdot \mathrm{Opt}$. Thus it is enough to prove the following:

Lemma 4.2 The MST on the set $N$ costs at most $O(T) \cdot \mathrm{Opt}$.

Let us define some useful notation, and review some concepts from previous sections. For any $v \in U$,
distance $\delta \in \mathbb{R}_+$ and graph $G'$, let the ball $B_{G'}(v, \delta)$ denote the set of vertices that are at most distance
$\delta$ from $v$. Given some value $t > 0$, a set $N \subseteq U$ of vertices is called a $t$-ball-packing in graph $G'$ if the
balls $\{B_{G'}(u, t)\}_{u \in N}$ are disjoint. Finally, recall $V_j(\mathbb{A}_j)$ as defined in Definition 2.5; here we set $r = 1$.
Given $\mathbb{A}$, recall that $\widehat{A}_j = \cap_{i \le j} A_i$.

The main structure lemma that will prove Lemma 4.2 is the following: 

Lemma 4.3 For any $j \in [T]$ and partial scenario-sequence $\mathbb{A}_j$, let the residual graph be $H := G/\phi^*_{\le j-1}(\mathbb{A}_{j-1})$.
If $N_j \subseteq \widehat{A}_j$ is any $(2T - j)\tau$-ball-packing in $H$ (i.e., the "net" elements) then the cost of the
minimum Steiner tree satisfies $\mathrm{MinSt}_H(N_j) \le 5T \cdot V_j(\mathbb{A}_j)$.

Before we prove Lemma 4.3, note that setting $j = 0$, the lemma says that the minimum
Steiner tree in $G$ connecting up the elements of $N$ (which is a $2T\tau$-ball-packing) costs at most $5T \cdot V_0 \le
5T \cdot \mathrm{Opt}$, which proves Lemma 4.2. One difference from the proof strategy used earlier (set cover and
min-cut) is that we cannot directly rely on structural properties from the two-stage problem [6], since
this yields only a guarantee exponential in $T$. Instead we give a different self-contained proof of the
inductive step to obtain an $O(T)$ approximation ratio; this also yields an alternate proof of a constant
approximation for 2-stage robust Steiner tree.

Now, back to the proof of the lemma. 

Proof: Let $\tau_i := (2T - i)\tau$ for any $i \in [T]$. In the base case $j = T$, we are given the complete
scenario sequence $\mathbb{A}$. The feasibility of the optimal solution implies that the edges in $\phi^*_{\le T}(\mathbb{A})$ connect
up $\widehat{A}_T$, and hence $N_T$. Consequently, the cost to connect up $N_T$ in $G/\phi^*_{\le T-1}(\mathbb{A}_{T-1})$ is at most
$c(\phi^*_T(\mathbb{A}_T)) = V_T(\mathbb{A}_T) \le 5T \cdot V_T(\mathbb{A}_T)$, since $T \ge 1$.

For the inductive step, assume that the lemma holds for $j + 1$, and let us prove it for $j$. Let $H =
G/\phi^*_{\le j-1}(\mathbb{A}_{j-1})$, and $H' = H/\phi^*_j(\mathbb{A}_j) = G/\phi^*_{\le j}(\mathbb{A}_j)$. Note that by assumption, $N_j$ is a $\tau_j$-ball-packing
in graph $H$, so the balls $\{B_H(u, \tau_j) \mid u \in N_j\}$ are disjoint.

Define $N_j^1 := \{v \in N_j \mid B_{H'}(v, \tau_{j+1}) \subseteq B_H(v, \tau_j)\}$ to be the set of points in $N_j$ such that their $\tau_{j+1}$-balls
in $H'$ are contained within their $\tau_j$-balls in $H$. (Recall that $\tau_{j+1} < \tau_j$, and hence this can indeed
happen.) Let $N_j^2 = N_j \setminus N_j^1$ be the remaining points in $N_j$. Let us prove some useful facts about
these two sets.

Claim 4.4 $|N_j^2| \cdot \tau_{j+1} \le 2T \cdot c(\phi^*_j(\mathbb{A}_j))$.

Proof: For any $v \in N_j^2$, it holds that $B_{H'}(v, \tau_{j+1}) \not\subseteq B_H(v, \tau_j)$. So there is some vertex $w \in
B_{H'}(v, \tau_{j+1}) \setminus B_H(v, \tau_j)$; this means that $d_H(v, w) > \tau_j$ but $d_{H'}(v, w) \le \tau_{j+1}$; the distance to $w$ has
shrunk by at least $\tau_j - \tau_{j+1} = \tau$. Since $H' = H/\phi^*_j(\mathbb{A}_j)$, this implies that the cost of the edges bought by $\phi^*_j(\mathbb{A}_j)$
just within this ball is large. In particular, if $E(B_H(v, \tau_j))$ denotes the edges induced on $B_H(v, \tau_j)$
then:

$$c\big(\phi^*_j(\mathbb{A}_j) \cap E(B_H(v, \tau_j))\big) \ge \tau_j - \tau_{j+1} = \tau, \quad \text{which holds for all } v \in N_j^2.$$

Since the edge-sets $E(B_H(v, \tau_j))$ for $v \in N_j$ are disjoint, we can sum over all vertices in $N_j^2$ to get
$c(\phi^*_j(\mathbb{A}_j)) \ge |N_j^2| \cdot \tau$. Finally, $\tau_{j+1} \le 2T \cdot \tau$, so $|N_j^2| \cdot \tau_{j+1} \le 2T \cdot c(\phi^*_j(\mathbb{A}_j))$, which proves the claim. ◁

Claim 4.5 The set $N_j^1$ forms a $\tau_{j+1}$-ball-packing in $H'$.

Proof: Assume not. Then the $\tau_{j+1}$-balls in $H'$ around some two points $u, v \in N_j^1$ must intersect.
But these balls are contained within the $\tau_j$-balls around them in $H$, by the definition of $N_j^1$, so
$B_H(u, \tau_j) \cap B_H(v, \tau_j) \ne \emptyset$. But this contradicts the fact that $u, v \in N_j$, and $N_j$ was a $\tau_j$-ball-packing
in $H$. ◁

Recall the definition of $W_j(\mathbb{A}_j)$ as in (2.2); note that we set $r = 1$.

Claim 4.6 For any $\tau_{j+1}$-ball-packing $Z \subseteq N_j$ in $H'$, its size $|Z| \le k_{j+1} - 1$. Moreover, $\mathrm{MinSt}_{H'}(Z) \le 5T \cdot W_j(\mathbb{A}_j)$.

Proof: For a contradiction, suppose $|Z| \ge k_{j+1}$, and let $A_{j+1} \subseteq Z$ denote any $k_{j+1}$-set. Observe
that $(A_1, \ldots, A_j, A_{j+1})$ is a valid partial scenario-sequence. Also, $A_{j+1} \subseteq Z \subseteq N_j \subseteq \widehat{A}_j$, and hence
$\widehat{A}_{j+1} = \cap_{i \le j+1} A_i = A_{j+1}$. Furthermore, $A_{j+1} \subseteq Z$ is a $\tau_{j+1}$-ball-packing in $H'$. Thus applying the
induction hypothesis for $j + 1$ on $(A_1, \ldots, A_j, A_{j+1})$, we get that $\mathrm{MinSt}_{H'}(A_{j+1}) \le 5T \cdot V_{j+1}(\mathbb{A}_{j+1})$.
Now,

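$$5T \cdot V_{j+1}(\mathbb{A}_{j+1}) \le 5T \cdot W_j(\mathbb{A}_j) \le 5T \cdot \frac{\mathrm{Opt}}{\lambda_{j+1}} \le \frac{5}{\beta} \cdot k_{j+1} T \tau \le \frac{T k_{j+1} \tau}{2}.$$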

The first inequality is by (2.2) which defines $W_j(\mathbb{A}_j)$, the second by Claim 2.8, the third by the
definition of $\tau$, and the last inequality uses $\beta = 10$. On the other hand, $A_{j+1}$ is a $\tau_{j+1}$-ball-packing in
$H'$ and so $\mathrm{MinSt}_{H'}(A_{j+1}) \ge (|A_{j+1}| - 1) \cdot \tau_{j+1} = (k_{j+1} - 1) \cdot \tau_{j+1} > k_{j+1} T \tau / 2$, since we may assume
$k_{j+1} \ge 2$ (otherwise $k_{j+1} \le 1$ and the optimal value is 0). This contradicts the above bound. Thus we
must have $|Z| < k_{j+1}$.

Now augment $Z$ with any $k_{j+1} - |Z|$ elements to obtain $A_{j+1}$, and apply the inductive hypothesis again
on the scenario-sequence $(A_1, \ldots, A_j, A_{j+1})$ and the $\tau_{j+1}$-ball-packing $Z \subseteq A_{j+1}$ in $H'$, to obtain

$$\mathrm{MinSt}_{H'}(Z) \le 5T \cdot V_{j+1}(\mathbb{A}_{j+1}) \le 5T \cdot W_j(\mathbb{A}_j). \qquad (4.7)$$

This completes the proof of the claim. ◁

Construct a maximal $\tau_{j+1}$-ball-packing $Z \subseteq N_j$ in the graph $H'$ as follows: add all of $N_j^1$ to $Z$ and
greedily add vertices from $N_j^2$ to $Z$ until no more can be added without violating the ball-packing
condition. Now, to prove the inductive step, we need to show how to connect up the set $N_j$ cheaply
in the graph $H$. We first bound the cost of this Steiner tree in $H'$. Claim 4.6 says we can connect
up $Z$ in the graph $H'$ at cost at most $5T \cdot W_j(\mathbb{A}_j)$. Since $Z$ is a maximal $\tau_{j+1}$-ball-packing inside $N_j$, we know
that each element in $N_j \setminus Z \subseteq N_j^2$ is at distance at most $2\tau_{j+1}$ from some element of $Z$. There are
at most $2T \cdot c(\phi^*_j(\mathbb{A}_j))/\tau_{j+1}$ elements in $N_j^2$ by Claim 4.4, so the total cost to connect these points to $Z$ is
at most the product $4T \cdot c(\phi^*_j(\mathbb{A}_j))$, giving us that

$$\mathrm{MinSt}_{H'}(N_j) \le 5T \cdot W_j(\mathbb{A}_j) + 4T \cdot c(\phi^*_j(\mathbb{A}_j)).$$

Finally, since the cost of the Steiner tree on $N_j$ in $H$ can exceed its cost in $H'$ by at most $c(\phi^*_j(\mathbb{A}_j))$, we get that

$$\mathrm{MinSt}_H(N_j) \le 5T \cdot W_j(\mathbb{A}_j) + 5T \cdot c(\phi^*_j(\mathbb{A}_j)) = 5T \cdot V_j(\mathbb{A}_j),$$

which completes the proof of the lemma. ■

5 Multistage Robust Steiner Forest 

Here we consider the multistage robust version of the Steiner Forest problem: 

Theorem 5.1 There is an $O(\min\{T, \log n, \log \lambda_{\max}\})$-approximation algorithm for $T$-stage $k$-robust
Steiner forest.

Again, we prove an $O(T)$-approximation ratio where $T$ is the number of stages; Theorem 5.1 then
follows from the scaling arguments in Appendix A. Here, both the algorithm and the proof will be 
slightly more involved than for the previous problems. Recall that the input for Steiner Forest consists
of an undirected edge-weighted graph $G$, and pairs $\{s_i, t_i\}_{i \in P}$. For any graph $H$ and subset $P' \subseteq P$
of pairs, we let $\mathrm{MinSF}_H(P')$ denote the minimum cost of a Steiner forest connecting pairs in $P'$; again
if the graph is not specified it is relative to the input graph $G$.

For $\Delta > 0$, we define a subset $N \subseteq P$ of pairs to be a $\Delta$-SFnet in graph $G'$ if

• $d_{G'}(s_i, t_i) > \Delta$ for all $i \in N$, and

• there exist $z_i \in \{s_i, t_i\}$ for all $i \in N$ such that the balls $\{B_{G'}(z_i, \Delta)\}_{i \in N}$ are disjoint.

The following simple property immediately follows from the definition:

Lemma 5.2 If $N \subseteq P$ is a $\Delta$-SFnet in graph $G'$ then $\mathrm{MinSF}_{G'}(N) \ge |N| \cdot \Delta$.

Proof: Consider the dual of the Steiner forest LP relaxation, which is a packing problem. Each ball
of radius at most $\Delta$ around any vertex in $\{z_i\}_{i \in N}$ is a feasible variable in the dual, since $d_{G'}(s_i, t_i) > \Delta$
for all $i \in N$. Now since $\{B_{G'}(z_i, \Delta)\}_{i \in N}$ are disjoint, there is a feasible dual solution of value $|N| \cdot \Delta$.
Hence the optimal Steiner forest costs at least as much. ■

However, it seems quite non-trivial to even compute a maximal $\Delta$-SFnet. This is unlike the previous
algorithms (set cover, min-cut, Steiner tree) where computing this net was straightforward. Instead, 
we will run a specialized procedure to compute a near-maximal net which suffices to give an algorithm 
for multistage Steiner forest. 

Now to describe the algorithm. Define $\beta := 10$, $\tau := \beta \cdot \max_{j \in [T]} \frac{\mathrm{Opt}}{\lambda_j k_j}$ and $j^* = \arg\min_{j \in [T]} (\lambda_j k_j)$.
We now run Algorithm 5.1 below to find a $\gamma = 2T\tau$-SFnet $N \subseteq P$, as well as a set of edges $E_{alg} \subseteq E$.
Then,

On day 0, buy the edges in $\phi_0 := E_{alg}$.

On day $j^*$, for each $(s, t) \in A_{j^*}$, buy a shortest $(s, t)$-path in the residual graph $G/\phi_0$.

On all other days, do nothing.

Algorithm 5.1 Algorithm for near-maximal $\gamma$-SFnet for Steiner forest
1: let $S_r, S_g, S_o, S_b, S_f, W \leftarrow \emptyset$.
2: while there exists a pair $i \in P$ with $d_{G/(S_r \cup S_f)}(s_i, t_i) > 4\gamma$ do
3:    let $S_r \leftarrow S_r \cup \{i\}$
4:    if $d_G(s_i, w) \le 2\gamma$ for some $w \in W$ then $S_f \leftarrow S_f \cup \{(s_i, w)\}$ else $W \leftarrow W \cup \{s_i\}$
5:    if $d_G(t_i, w') \le 2\gamma$ for some $w' \in W$ then $S_f \leftarrow S_f \cup \{(t_i, w')\}$ else $W \leftarrow W \cup \{t_i\}$
6:    let $\delta_i \in \{0, 1, 2\}$ be the increase in $|W|$ due to steps (4) and (5).
7:    if ($\delta_i = 0$) then $S_b \leftarrow S_b \cup \{i\}$; if ($\delta_i = 1$) then $S_o \leftarrow S_o \cup \{i\}$; if ($\delta_i = 2$) then $S_g \leftarrow S_g \cup \{i\}$.
8: let $E_{alg}$ := 2-approximate Steiner forest on pairs $S_r$, along with shortest-paths connecting every
   pair in $S_f$.
9: output the $\gamma$-SFnet $N := S_g \cup S_o$, and edge set $E_{alg}$.



Set $\gamma := 2T\tau$. In Algorithm 5.1, $G/(S_r \cup S_f)$ denotes the graph obtained from $G$ by identifying all
pairs in $S_r$ and $S_f$. We note that this algorithm is essentially the same as the one for 2-stage robust
Steiner forest in [6]; however we need to rephrase it slightly in order to fit it into our context.
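
To make the bookkeeping of Algorithm 5.1 concrete, here is a minimal Python sketch of steps 1 to 7 and 9 (step 8, which actually buys edges, is omitted). Everything about the encoding is our own assumption rather than the paper's: the adjacency-dictionary graph, the hypothetical names `dijkstra` and `near_maximal_sfnet`, modelling "identifying" a pair by a zero-length edge between its endpoints, and reading the test in steps 4 and 5 as "at most $2\gamma$".

```python
import heapq
from math import inf


def dijkstra(graph, source):
    """Shortest-path distances from `source`; graph maps vertex -> [(neighbor, length)]."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u]:
            if d + w < dist.get(v, inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def near_maximal_sfnet(graph, pairs, gamma):
    """Bookkeeping of Algorithm 5.1; pairs are identified by their index in `pairs`."""
    S_r, S_g, S_o, S_b, S_f, W = [], [], [], [], [], []
    while True:
        # Model G/(S_r U S_f): each identified pair becomes a zero-length edge.
        contracted = {v: list(nbrs) for v, nbrs in graph.items()}
        for a, b in [pairs[i] for i in S_r] + S_f:
            contracted[a].append((b, 0.0))
            contracted[b].append((a, 0.0))
        far = next((i for i, (s, t) in enumerate(pairs)
                    if dijkstra(contracted, s).get(t, inf) > 4 * gamma), None)
        if far is None:
            break  # every pair is within 4*gamma in the contracted graph
        S_r.append(far)
        delta = 0  # increase in |W| caused by this pair
        for x in pairs[far]:
            dist = dijkstra(graph, x)  # distances in the original graph G
            close = next((w for w in W if dist.get(w, inf) <= 2 * gamma), None)
            if close is None:
                W.append(x)
                delta += 1
            else:
                S_f.append((x, close))  # a "fake" pair
        (S_b, S_o, S_g)[delta].append(far)
    return {"net": S_g + S_o, "S_r": S_r, "S_f": S_f, "W": W}
```

The loop terminates after at most $|P|$ iterations, since a pair added to $S_r$ has contracted distance zero afterwards and is never selected again.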

To bound the cost incurred on day $j^*$, we note that the algorithm adds pairs in $P$ whose distance in
the residual graph $G/E_{alg}$ is more than $4\gamma = 8T\tau$. Hence the cost of connecting any pair in $P \setminus N$
is at most $8T\tau$, and consequently the total effective cost incurred on day $j^*$ is $8T\tau \cdot |A_{j^*}| \cdot \lambda_{j^*} =
8T\beta \cdot \mathrm{Opt} = O(T) \cdot \mathrm{Opt}$. Thus it is enough to prove the following lemma, whose proof we present in the
next section:

Lemma 5.3 The cost of the edges $E_{alg}$ is at most $144T \cdot \mathrm{Opt}$.

The proof bounding the cost of edges in $E_{alg}$ is also more complicated than in previous sections; here is
the roadmap. First, we use the (by now familiar) reverse inductive proof to bound the cost of edges
in an optimal Steiner forest connecting the pairs in the net $N$. However, our set of edges $E_{alg}$ is not
just an approximate Steiner forest for the pairs in $N$, but also connects other nodes, and we have to
also bound the cost of the extra edges our algorithm buys.

For the rest of the section, we use $\gamma := 2T\tau$. Recall that $\widehat{A}_j = \cap_{i \le j} A_i$. Also recall $V_j(\mathbb{A}_j)$ from
Definition 2.5; here we use $r = 2$. Recall that a set $S \subseteq U$ in a graph $G' = (U, E')$ is called a
$t$-ball-packing if the balls $\{B_{G'}(x, t)\}_{x \in S}$ are disjoint. Hence, the second condition in the definition of
a set $N$ being a $\Delta$-SFnet is that for each $\{s_i, t_i\} \in N$, there exists $z_i \in \{s_i, t_i\}$ such that the set of
these $z_i$'s is a $\Delta$-ball-packing. If $N$ is an SFnet, let $C(N)$ denote this ball-packing that witnesses it.

Lemma 5.4 For any $j \in [T]$ and partial scenario-sequence $\mathbb{A}_j$, let the residual graph in stage $j$ of the
optimal algorithm be $H := G/\phi^*_{\le j-1}(\mathbb{A}_{j-1})$, and let $N_j \subseteq \widehat{A}_j$ be any $(2T - j)\tau$-SFnet in $H$ (i.e., the
"net" elements). Then the cost of the optimal Steiner forest on $N_j$ is $\mathrm{MinSF}_H(N_j) \le 9T \cdot V_j(\mathbb{A}_j)$.

Proof: Let $\tau_i := (2T - i)\tau$ for any $i \in [T]$. In the base case $j = T$, we are given the complete
scenario sequence $\mathbb{A}$. The feasibility of the optimal solution implies that the edges in $\phi^*_{\le T}(\mathbb{A})$ connect
the pairs $\widehat{A}_T$, and hence $N_T$. Consequently, the cost to connect up $N_T$ in $G/\phi^*_{\le T-1}(\mathbb{A}_{T-1})$ is at most
$c(\phi^*_T(\mathbb{A}_T)) = V_T(\mathbb{A}_T) \le 9T \cdot V_T(\mathbb{A}_T)$.

Let $\mathcal{T}_j := \{s_i, t_i \mid i \in N_j\}$ be all the terminals in $N_j$. Since $N_j$ is a $\tau_j$-SFnet in graph $H$, $C(N_j)$
is a $\tau_j$-ball-packing in $H$. Let graph $H' := H/\phi^*_j(\mathbb{A}_j)$. Similar to the proof for Steiner tree, define
$N_j^1 := \{i \in N_j \mid B_{H'}(z_i, \tau_{j+1}) \subseteq B_H(z_i, \tau_j)\}$, and let $N_j^2 = N_j \setminus N_j^1$ be the rest of the pairs. The fact
that $N_j^1$ is a $\tau_{j+1}$-SFnet in $H'$ follows from the definition of $N_j^1$ and Claim 4.5. We bound the costs
$\mathrm{MinSF}_{H'}(N_j^1)$ and $\mathrm{MinSF}_{H'}(N_j^2)$ separately by invoking the inductive hypothesis twice. Let $W_j(\mathbb{A}_j)$
be defined as in (2.2); note that $V_j(\mathbb{A}_j) = c(\phi^*_j(\mathbb{A}_j)) + 2W_j(\mathbb{A}_j)$, since $r = 2$ in this case.

[Figure 5.1: The auxiliary graphs $Q_1$ and $Q_2$. Dotted lines denote edges of $Q_1$; large circles denote the components $C_1, \ldots, C_p$ of $Q_1$; solid lines are edges in $Q_2$; arrowed solid lines denote the directed forest $\mathcal{F}$.]



The next two claims have proofs almost identical to the Steiner tree Claim 4.6 and Claim 4.4 respectively.

Claim 5.5 The cardinality of $N_j^1$ is $|N_j^1| < k_{j+1}$, and $\mathrm{MinSF}_{H'}(N_j^1) \le 9T \cdot W_j(\mathbb{A}_j)$.

Claim 5.6 We have $|N_j^2| \cdot \tau_{j+1} \le 2T \cdot c(\phi^*_j(\mathbb{A}_j))$.

Recall we need to show that the cost to connect up pairs in $N_j$ is small; we now have that connecting
up $N_j^1$ is not expensive, and have a bound on the cardinality of $N_j^2$. It remains to show that this can
be used to bound its cost. This we do in the following discussion culminating in Claim 5.9.

Let $\mathcal{T}_j^2 := \{s_i, t_i \mid i \in N_j^2\}$ be the terminals corresponding to pairs in $N_j^2$. Consider an auxiliary graph
$Q_1$ on these terminals, where edges connect nodes at distance at most $2\tau_{j+1}$, i.e., $V(Q_1) = \mathcal{T}_j^2$, and
$E(Q_1) = \{(a, b) \mid a, b \in \mathcal{T}_j^2,\ d_{H'}(a, b) \le 2\tau_{j+1}\}$. Let $C_1, \ldots, C_p$ denote the connected components in
this graph $Q_1$, so $\mathcal{T}_j^2 = \cup_{\ell=1}^{p} C_\ell$. See also Figure 5.1. Let $\mathcal{M}$ denote the minimum length forest in the
graph $H'$ having the same connected components as $Q_1$.

Claim 5.7 The cost of the forest $\mathcal{M}$ (in graph $H'$) is at most $8T \cdot c(\phi^*_j(\mathbb{A}_j))$.

Proof: Two nodes in $Q_1$ are connected only if their distance in $H'$ is at most $2\tau_{j+1}$; moreover, the
number of vertices in $Q_1$ is $|\mathcal{T}_j^2| = 2|N_j^2|$. Hence, the forest $\mathcal{M}$ costs at most $4|N_j^2| \cdot \tau_{j+1}$, which is at
most $8T \cdot c(\phi^*_j(\mathbb{A}_j))$ using Claim 5.6. ◁

Define $\sigma : \mathcal{T}_j^2 \to \{1, \ldots, p\}$ mapping each terminal in $\mathcal{T}_j^2$ to the index of its component in $Q_1$; i.e.,
$\sigma(u) = \ell$ if $u \in C_\ell \subseteq Q_1$. Define another auxiliary multigraph $Q_2$ on vertices $\{1, 2, \ldots, p\}$ and edges
$E(Q_2) := \{(\sigma(s_i), \sigma(t_i)) \mid i \in N_j^2\}$. (See Figure 5.1.) Observe that there is a one-to-one correspondence
between edges in $Q_2$ and pairs in $N_j^2$. Let $\mathcal{F}$ denote the edges of any maximal forest in $Q_2$ ($\mathcal{F}$ cannot
contain any self-loops); we also use $\mathcal{F}$ to denote the corresponding pairs from $N_j^2$.
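
The construction of $Q_1$, the component map, and a maximal forest of $Q_2$ can be sketched with a union-find structure. This is a minimal Python illustration, not the authors' code: the names `DisjointSets` and `auxiliary_forest` are hypothetical, `dist` is assumed to be any function returning distances in $H'$, and `radius` plays the role of $2\tau_{j+1}$.

```python
class DisjointSets:
    """Union-find over a fixed collection of hashable items."""

    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets of a and b; return False if they were already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def auxiliary_forest(pairs, dist, radius):
    """Return (component map, indices of pairs forming a maximal forest of Q2)."""
    terminals = sorted({x for pair in pairs for x in pair})
    q1 = DisjointSets(terminals)
    # Q1: join terminals at distance at most `radius`.
    for i, a in enumerate(terminals):
        for b in terminals[i + 1:]:
            if dist(a, b) <= radius:
                q1.union(a, b)
    sigma = {x: q1.find(x) for x in terminals}  # terminal -> component representative
    # Q2: one edge per pair between components; keep an edge only if it joins
    # two different trees, so self-loops and redundant parallel edges are skipped.
    q2 = DisjointSets(set(sigma.values()))
    forest = []
    for i, (s, t) in enumerate(pairs):
        if q2.union(sigma[s], sigma[t]):
            forest.append(i)
    return sigma, forest
```

For points on a line with `dist = lambda a, b: abs(a - b)`, pairs `[(0, 10), (1, 11), (0, 1), (20, 30)]` and `radius = 2`, the second pair is parallel to the first and the third is a self-loop, so only the first and fourth pairs enter the forest.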

Claim 5.8 $\mathrm{MinSF}_{H'}(\mathcal{F}) \le 9T \cdot W_j(\mathbb{A}_j)$.

Proof: Orient the edges in the forest $\mathcal{F}$ so that each vertex $\ell \in V(Q_2)$ has at most one incoming
edge; call this map $\pi : \mathcal{F} \to V(Q_2)$, and note that $|\pi^{-1}(\ell)| \le 1$ for each $\ell \in V(Q_2)$. For each $i \in \mathcal{F}$,
set $z'_i = s_i$ if the edge is oriented towards $s_i$, and $z'_i = t_i$ otherwise. Clearly, the vertices $\{z'_i\}_{i \in \mathcal{F}}$ lie in
distinct components of $Q_1$. Hence $\{z'_i\}_{i \in \mathcal{F}}$ is an independent set in $Q_1$, and hence is a $\tau_{j+1}$-ball-packing
in $H'$. Moreover, since $\{s_i, t_i\}$ lie in distinct components of $Q_1$, $d_{H'}(s_i, t_i) > 2\tau_{j+1}$; hence $\mathcal{F}$ is also
a $\tau_{j+1}$-SFnet in graph $H'$. Now, as in the proof of Claim 4.6, applying the induction hypothesis to $\mathcal{F}$ gives

$$\mathrm{MinSF}_{H'}(\mathcal{F}) \le 9T \cdot W_j(\mathbb{A}_j). \quad ◁$$

Claim 5.9 $\mathrm{MinSF}_{H'}(N_j^2) \le \mathrm{MinSF}_{H'}(\mathcal{F}) + c(\mathcal{M}) \le 9T \cdot W_j(\mathbb{A}_j) + 8T \cdot c(\phi^*_j(\mathbb{A}_j))$.

Proof: We will show that for every $i \in N_j^2$, the subgraph $\mathcal{M} \cup \mathrm{MinSF}_{H'}(\mathcal{F})$ connects $s_i$ to $t_i$ in graph
$H'$, after which the claim follows directly from Claims 5.7 and 5.8.

Recall that $\mathcal{M}$ has the same connected components as graph $Q_1$. For any $i \in N_j^2$, let $a = \sigma(s_i)$ and
$b = \sigma(t_i)$ be the indices of the components containing $s_i$ and $t_i$ respectively. If $a = b$, then $\mathcal{M}$ connects
$\{s_i, t_i\}$, so assume $a \ne b$. Since $\mathcal{F}$ is a maximal forest in $Q_2$, it contains some path $e_1, \ldots, e_q$ from
$a$ to $b$. But now $\mathrm{MinSF}_{H'}(\mathcal{F}) \cup \mathcal{M}$ connects $s_i$ to $t_i$. ◁

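Finally using sub-additivity of Steiner forest, and Claims 5.5 and 5.9,

$$\mathrm{MinSF}_{H'}(N_j) \le \mathrm{MinSF}_{H'}(N_j^1) + \mathrm{MinSF}_{H'}(N_j^2) \le 2 \cdot 9T \cdot W_j(\mathbb{A}_j) + 8T \cdot c(\phi^*_j(\mathbb{A}_j)).$$

Thus $\mathrm{MinSF}_H(N_j) \le c(\phi^*_j(\mathbb{A}_j)) + \mathrm{MinSF}_{H'}(N_j) \le 9T \cdot [c(\phi^*_j(\mathbb{A}_j)) + 2 \cdot W_j(\mathbb{A}_j)] = 9T \cdot V_j(\mathbb{A}_j)$. ■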

As a consequence of the case $j = 0$ of Lemma 5.4 from the previous section, we know that the cost of
the optimal Steiner forest on any $\tau_0 = 2T\tau$-SFnet $N_0$ in the original graph $G$ is at most $9T \cdot V_0 \le 9T \cdot \mathrm{Opt}$.
And indeed, the set $N = S_g \cup S_o$ is a $2T\tau$-SFnet in $G$, since $\{B_G(w, 2T\tau) \mid w \in W\}$ are disjoint and
$W \cap \{s_i, t_i\} \ne \emptyset$ for all $i \in S_g \cup S_o$. However, the set $E_{alg}$ is not just the Steiner forest on $S_g \cup S_o$; it
is actually a Steiner forest on $S_r = S_g \cup S_o \cup S_b$, along with shortest paths between every "fake" pair
in $S_f$. This is what we bound in the proof below.

Proof of Lemma 5.3: Observe that $c(E_{alg}) \le 2 \cdot \mathrm{MinSF}(S_r) + 2\gamma \cdot |S_f|$, since the distance between
each pair in $S_f$ is at most $2\gamma$. (Remember, $\gamma = 2T\tau$.) Recall that $N = S_g \cup S_o$ and $S_r = N \cup S_b$;
so $\mathrm{MinSF}(S_r) \le \mathrm{MinSF}(N) + \mathrm{MinSF}(S_b)$. Finally, the argument in the previous paragraph shows that
$\mathrm{MinSF}_G(N) \le 9T \cdot V_0 \le 9T \cdot \mathrm{Opt}$. Hence

$$c(E_{alg}) \le 2 \cdot 9T \cdot \mathrm{Opt} + 2 \cdot \mathrm{MinSF}(S_b) + 2\gamma \cdot |S_f|. \qquad (5.8)$$

Moreover, $S_b$ might be very far from a $\gamma$-SFnet, so we cannot just apply the same techniques to it.

Bounding $\mathrm{MinSF}(S_b)$. As in the proof of Lemma 5.4, define an auxiliary graph $Q_1$ with vertices
$V(Q_1) = \{s_i, t_i \mid i \in S_b\}$, and edges $E(Q_1) = \{(a, b) \mid d_G(a, b) \le 2\gamma\}$ between any two terminals
in $V(Q_1)$ that are at most $2\gamma$ apart. Let $C_1, \ldots, C_p$ be the connected components in this graph $Q_1$,
and let $\mathcal{M}$ be the minimum cost spanning forest in $G$ having the same connected components as $Q_1$.
Since there are $2|S_b|$ vertices in $Q_1$ and edges correspond to pairs at most $2\gamma$ from each other, the cost
$c(\mathcal{M}) \le 4|S_b| \gamma$.

Again, define the map $\sigma : V(Q_1) \to \{1, \ldots, p\}$ where $\sigma(u) = \ell$ if $u \in C_\ell$. Define another auxiliary graph
$Q_2$ on vertices $\{1, 2, \ldots, p\}$ with edges $E(Q_2) := \{(\sigma(s_i), \sigma(t_i)) \mid i \in S_b\}$. Let $\mathcal{F}$ denote the edges of
any maximal forest in $Q_2$, and also the corresponding pairs from $S_b$. By orienting $\mathcal{F}$ so that
each vertex has indegree at most one, we obtain $z'_i \in \{s_i, t_i\}$ for all $i \in \mathcal{F}$ satisfying $|C \cap \{z'_i\}_{i \in \mathcal{F}}| \le 1$
for each component $C$ of $Q_1$ (see Claim 5.8 for details). Thus $\{z'_i\}_{i \in \mathcal{F}}$ is an independent set in $Q_1$, and
so $\{B_G(z'_i, \gamma) \mid i \in \mathcal{F}\}$ are disjoint. This implies that $\mathcal{F}$ is a $\gamma$-SFnet in $G$. Applying Lemma 5.4 on $\mathcal{F}$
(with $j = 0$) gives $\mathrm{MinSF}_G(\mathcal{F}) \le 9T \cdot \mathrm{Opt}$.

Finally, as in Claim 5.9 we obtain $\mathrm{MinSF}(S_b) \le \mathrm{MinSF}(\mathcal{F}) + c(\mathcal{M}) \le 9T \cdot \mathrm{Opt} + 4|S_b| \gamma$. Combining
this with (5.8) we have:

$$c(E_{alg}) \le 36T \cdot \mathrm{Opt} + 8|S_b| \gamma + 2|S_f| \gamma. \qquad (5.9)$$

Bounding $|S_b|$ and $|S_f|$. We use the following property of Algorithm 5.1.

Claim 5.10 ([6]) In any execution of Algorithm 5.1, $|S_b| \le |N|$ and $|S_f| \le 2|N|$.

Proof: This follows directly from the analysis in [6]. Lemma 5.3 in that paper yields $|S_f| \le |S_r|$.
By the definition of the various sets $S_g, S_b, S_o, S_f$ and $N = S_g \cup S_o$, we have that $|S_f| \ge 2|S_b|$ and
$|S_r| = |N| + |S_b|$. Thus we have $2|S_b| \le |S_f| \le |S_r| \le |N| + |S_b|$, implying $|S_b| \le |N|$. Finally,
$|S_f| \le |S_r| = |N| + |S_b| \le 2|N|$. ◁

Combining this with (5.9), it just remains to bound $|N| \gamma$. Recall that $N$ is a $\gamma$-SFnet in $G$; so
$\gamma |N| \le \mathrm{MinSF}_G(N)$ by Lemma 5.2. On the other hand, we already argued that $\mathrm{MinSF}_G(N) \le 9T \cdot \mathrm{Opt}$. Putting this
together with (5.9), we get

$$c(E_{alg}) \le 36T \cdot \mathrm{Opt} + 8|S_b| \gamma + 2|S_f| \gamma \le 36T \cdot \mathrm{Opt} + 12|N| \gamma \le 144T \cdot \mathrm{Opt}.$$

This completes the proof of Lemma 5.3. ■

A Preprocessing to Bound Number of Stages T

We show using a scaling argument how to ensure $T \le O(\min\{\log \lambda_{\max}, \log n\})$ for the min-cut, Steiner
tree and Steiner forest problems (here $n$ is the number of vertices in the input graph). Recall that the
approximation for multistage robust set cover does not depend on $T$, hence such a preprocessing step
is not required for set cover.

It is easy to ensure that $\lambda_{\max} = \lambda_T \ge 2^T$ while losing a constant factor in the objective: we simply find a
maximal subsequence of stages where the inflation $\lambda_i$ in each stage in the subsequence increases by
at least a factor of two from the previous stage in the subsequence, and perform actions only on these
stages (or in the very final stage $T$). This means $T \le \log_2 \lambda_{\max}$.
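
One way to pick such a subsequence is a single greedy pass. The sketch below is only an illustration under our own assumptions: the function name `doubling_stages` is hypothetical, inflation factors are given as a list indexed by stage, and the pass starts by keeping stage 0 (the text only asks for some maximal subsequence).

```python
def doubling_stages(inflation):
    """Indices of the stages on which actions are allowed.

    Keeps stage 0, then every stage whose inflation is at least twice that of
    the most recently kept stage, and finally the last stage.
    """
    kept = [0]
    for i in range(1, len(inflation)):
        if inflation[i] >= 2 * inflation[kept[-1]]:
            kept.append(i)
    last = len(inflation) - 1
    if kept[-1] != last:
        kept.append(last)  # the very final stage is always available
    return kept
```

For inflation factors `[1, 1.5, 2, 3, 4.5, 9]` the pass keeps stages 0, 2, 4 and 5; the inflation at least doubles between consecutive kept stages.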

We now ensure that $T \le O(\log n)$. Guess the edge $f$ of maximum cost (unscaled) that is ever bought
by the optimal strategy on any scenario-sequence; there are at most $n^2$ choices for $f$ and the algorithm
will enumerate over these. It follows that $\mathrm{Opt} \ge c_f$ since $f$ must be bought on some scenario-sequence,
and all inflation factors are $\ge 1$. Let $E_{high} := \{e \in E : c_e > c_f\}$; by the choice of $f$, no edge in $E_{high}$ is
ever used by Opt for any scenario-sequence. So the optimal value does not change even if we remove
edges $E_{high}$ from the instance; in Steiner tree/forest this means deleting edges $E_{high}$ from the graph,
and in min-cut this means contracting edges $E_{high}$. This yields an instance with maximum edge-cost
$c_{\max} = c_f$.

Let $E_{low} := \{e \in E : c_e < c_f/n^2\}$; the total cost of edges in $E_{low}$ is at most $|E_{low}| \cdot \frac{c_f}{n^2} \le c_f \le \mathrm{Opt}$
by the choice of $f$. So we can assume that all edges in $E_{low}$ are always bought in stage 0, at the loss
of an additive Opt term in the objective. This ensures that the minimum edge-cost $c_{\min} \ge c_f/n^2$.
Combined with the above step, we have an instance with $c_{\max}/c_{\min} \le n^2$.
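
For a fixed guess of the edge $f$, the two pruning steps amount to a three-way split of the edge set. The following Python sketch is illustrative only: the name `prune_edges` and the dictionary representation of edge costs are our own assumptions, and the enumeration over all guesses for $f$ is left out.

```python
def prune_edges(edge_cost, cost_f, n):
    """Split edges into (high, low, rest) for a guessed maximum bought cost `cost_f`.

    high: cost > cost_f        (deleted, or contracted for min-cut)
    low:  cost < cost_f / n^2  (bought up front in stage 0)
    rest: everything else      (cost ratio at most n^2)
    """
    low_threshold = cost_f / n ** 2
    high = {e for e, c in edge_cost.items() if c > cost_f}
    low = {e for e, c in edge_cost.items() if c < low_threshold}
    rest = {e: c for e, c in edge_cost.items() if e not in high and e not in low}
    return high, low, rest
```

Every edge in `rest` has cost between `cost_f / n**2` and `cost_f`, which is the bounded cost ratio used in the next step.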

Observe that any stage $i$ with $\lambda_i > n^2 \cdot \frac{c_{\max}}{c_{\min}}$ is completely inactive in any optimal strategy: if not
then the objective of the resulting strategy is greater than $n^2 \cdot c_{\max}$, whereas the trivial strategy of
buying all elements in stage 0 costs at most $n^2 \cdot c_{\max}$. Hence, without loss of generality, we have
$\lambda_{\max} \le n^2 \cdot \frac{c_{\max}}{c_{\min}}$, which combined with the above gives $\lambda_{\max} \le n^4$. Thus $T \le \log_2 \lambda_{\max} \le O(\log n)$.

B Some Useful Examples 

B.1 Non-Optimality of Two-Stage Strategies for Set Cover

We give an instance of multistage set-cover where any optimal solution has to buy sets on all days. 
This shows that we really need to consider near-optimal strategies to prove our structure result about 
thrifty strategies. 

Let $T$ denote the time horizon. The scenario bounds are $k_i = T + 1 - i$ for each $0 \le i \le T$. The
inflation factors are $\lambda_j = (1 + \epsilon)^j$ for each $j \in [T]$, where $\epsilon > 0$ is chosen such that $\lambda_T < 2$. The
elements in the set-system are $U := \{1, \ldots, T, T + 1\}$. The sets and their costs are as follows:

• For each $i \in \{1, \ldots, T - 1\}$, define $\mathcal{S}_i$ as the collection consisting of all subsets of size $k_i = T + 1 - i$ of
$\{i, \ldots, T + 1\}$ except $\{i + 1, \ldots, T + 1\}$; each of the sets in $\mathcal{S}_i$ has cost $\lambda_T/\lambda_i$. Note that each
set in $\mathcal{S}_i$ contains $i$.

• There are two singleton sets $\{T\}$ and $\{T + 1\}$ of cost one each; let $\mathcal{S}_T := \{\{T\}, \{T + 1\}\}$.

The number of sets is at most $T^2$, and each costs at least one.
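
The instance is small enough to generate and sanity-check mechanically. The Python sketch below builds the set system for a given $T$; it is our own illustration (the name `build_instance` is hypothetical) and it assumes the inflation factors $\lambda_j = (1 + \epsilon)^j$ as read above.

```python
from itertools import combinations


def build_instance(T, eps):
    """Return (inflation factors, dict mapping each set to its cost)."""
    lam = [(1 + eps) ** j for j in range(T + 1)]
    cost = {}
    for i in range(1, T):
        ground = range(i, T + 2)                 # {i, ..., T+1}
        excluded = frozenset(range(i + 1, T + 2))  # the one subset missing i
        for combo in combinations(ground, T + 1 - i):
            s = frozenset(combo)
            if s != excluded:
                cost[s] = lam[T] / lam[i]
    cost[frozenset({T})] = 1.0
    cost[frozenset({T + 1})] = 1.0
    return lam, cost
```

For example, `build_instance(4, 0.1)` yields inflation $\lambda_4 = 1.1^4 < 2$ and a system in which every set of size $T + 1 - i \ge 2$ contains element $i$.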

Consider the strategy $\sigma$ that does nothing on day 0 and for each day $i \in \{1, 2, \ldots, T\}$ does:

• If the scenario $A_i \in \binom{U}{k_i}$ in day $i$ equals one of the sets in $\mathcal{S}_i$ then buy set $A_i$ on day $i$. Note
that this set $A_i$ in the set-system has cost $\lambda_T/\lambda_i$.

• For any other scenario $A_i \in \binom{U}{k_i}$, buy nothing on day $i$.

It can be checked directly that this is indeed a feasible solution. Moreover, the effective cost under
every scenario-sequence is exactly $\lambda_T < 2$. Note that for every scenario-sequence, $\sigma$ buys sets on
exactly one day; however the days corresponding to different scenarios are different. In fact, for each
$i \in [T]$ there is a scenario-sequence (namely $A_j = \{j + 1, \ldots, T + 1\}$ for $j < i$ and $A_i \in \mathcal{S}_i$) where $\sigma$
buys sets on day $i$. Thus strategy $\sigma$ buys sets on all days.

We now claim that $\sigma$ is the unique optimal solution to this instance. For any other feasible strategy
$\sigma'$, consider a scenario-sequence $(A_1, \ldots, A_T)$ where $\sigma'$ behaves differently from $\sigma$. Let $i \in \{0, \ldots, T\}$
denote the earliest day when $\sigma'$ differs from $\sigma$ under this scenario-sequence. There are two possibilities:

• $A_j = \{j + 1, \ldots, T + 1\}$ for each $j < i$ and $A_i \in \mathcal{S}_i$, i.e. $\sigma$ buys nothing before day $i$ and buys set
$A_i$ on day $i$. $\sigma'$ buys nothing before day $i$ and covers $A'_i \subsetneq A_i$ on day $i$. From the construction
of sets, it follows that $i \notin A'_i$ (otherwise $\sigma'$ buys some set of cost $\ge \lambda_T/\lambda_i$ and so we may assume
$A'_i = A_i$). Consider any scenario-sequence that equals $A_1, \ldots, A_i$ until day $i$, and has $\{i\}$ as
the scenario on day $T$: the effective cost under $\sigma'$ is then at least $\lambda_{i+1} \cdot \lambda_T/\lambda_i > \lambda_T$ (since the
min-cost set covering $i$ costs $\lambda_T/\lambda_i$ and faces inflation at least $\lambda_{i+1}$).

• $A_j = \{j + 1, \ldots, T + 1\}$ for each $j \le i$, i.e. $\sigma$ buys nothing until day $i$ (inclusive). $\sigma'$ buys
nothing before day $i$ and covers $A'_i \ne \emptyset$ on day $i$. We have the two cases:

1. If $A'_i \subsetneq A_i$ then let $e \in A_i \setminus A'_i$. Consider any scenario-sequence that equals $A_1, \ldots, A_i$
until day $i$, and has $\{e\}$ as the scenario on day $T$: the effective cost under $\sigma'$ is then at
least $2 \cdot \lambda_i > \lambda_T$ (since $\sigma'$ buys at least two sets on days $i$ or later).

2. If $A'_i = A_i = \{i + 1, \ldots, T + 1\}$ then the cost of $\sigma'$ under any scenario-sequence with
$A_1, \ldots, A_i$ until day $i$ is at least $2 \cdot \lambda_i > \lambda_T$ (since the min set cover of $\{i + 1, \ldots, T + 1\}$
costs at least 2).

In all cases above $\sigma'$ has objective value strictly larger than $\lambda_T$. Thus $\sigma$ is the unique optimal solution.

Remark: It is still possible that there is always a thrifty (i.e. two-stage) solution to multistage 
robust set-cover having cost within $O(1) \times \mathrm{Opt}$ (the above example just shows that the constant is at
least two). If such a result were true, then in conjunction with the two-stage result from [6] it would 
also yield Theorem 2.1. However, proving such an existence result seems no easier than obtaining an 
algorithm for the original multistage problem. Indeed, we take the latter approach in this paper and 
directly give thrifty approximation algorithms for all the multistage problems considered. 

B.2 Bad Example for Subset k-robust Uncertainty Sets

Consider the following slight generalization of the $k$-robust uncertainty model: on each day $i$ we are
revealed some set $A_i \in \Omega_i$ such that $A_i$ contains the final scenario $A$, where $\Omega_i = \{S \subseteq U : |S \cap P_i| \le k_i\}$
consists of all subsets of $U$ having at most $k_i$ elements in a designated set $P_i \subseteq U$ (it can have any
number of elements from $U \setminus P_i$). Again, the final scenario is $A = \cap_{i \le T} A_i$.

Recall that we obtain the multistage $k$-robust model studied in this paper by setting $P_i = U$ (i.e.
$\Omega_i = \binom{U}{k_i}$) on all days. We give an example showing that thrifty algorithms perform poorly for
multistage set cover under the above 'subset $k$-robust' uncertainty sets.

Consider a universe $U$ of elements partitioned into $T$ parts $\cup_{i=1}^{T} P_i$. We consider a constant number of
stages $T$. The set system consists only of singleton sets, so it suffices to talk about costs on elements.
Fix a parameter $\lambda > T$. For each $i \in [T]$, we have:

• $|P_i| = \lambda^{i+1}$

• Each $P_i$-element has cost $1/\lambda^i$

• The inflation factor on day $i$ is $\lambda^i$

• $k_i = 1$, so $\Omega_i = \{S \subseteq U : |S \cap P_i| \le 1\}$

We first show that the optimal value is at most $T$. Note that on each day $i$, there is at most one active
$P_i$-element (i.e. $|A_i \cap P_i| \le 1$). Consider the strategy that on each day $i$ covers the unique active
$P_i$-element: the worst case cost equals $\sum_{i=1}^{T} \lambda^i \cdot 1/\lambda^i = T$. This strategy is feasible since $\cup_{i=1}^{T} P_i = U$.

On the other hand, we now show that any strategy that is not active on all days has cost at least
$\lambda > T$. Consider any strategy that is inactive on some day $j \in [T]$. One of the following cases occurs:

1. Suppose that on every scenario sequence, all active $P_j$-elements are covered by day $j - 1$. Then
consider any scenario sequence with the entire $P_j$ active on day $j - 1$ (this is possible since $P_j \in \Omega_i$
for all $i \le j - 1$). The total cost incurred on the first $j - 1$ days (on this scenario sequence) is
at least $|P_j| \cdot \frac{1}{\lambda^j} = \lambda$.

2. Suppose that there is a partial scenario sequence $\sigma$ (until day $j - 1$) where an active $P_j$-element
$e$ remains uncovered after day $j - 1$. Extend $\sigma$ to a scenario sequence that has $e$ active until
the end (i.e. $e$ is the unique $P_j$-element that remains active after day $j$). Since the algorithm is
inactive on day $j$, it must be that $e$ is covered on day $j + 1$ or later; thus the total cost along
this scenario sequence is at least $\lambda^{j+1} \cdot \frac{1}{\lambda^j} = \lambda$.

Setting $\lambda^{T+1} = \Omega(|U|)$ it follows that thrifty strategies can be worse by a polynomial (in $|U|$) factor
for multistage robust set-cover under subset $k$-robust uncertainty sets.